DOI: 10.1145/2483977.2483988
research-article

Blip10000: a social video dataset containing SPUG content for tagging and retrieval

Published: 28 February 2013

Abstract

The growing volume of digital multimedia content is inspiring new types of user interaction with video data. Users want to find content easily by searching and browsing. Techniques are therefore needed that support automatic categorisation, content search, and linking to related information. In this work, we present a dataset that contains comprehensive semi-professional user-generated (SPUG) content, including audiovisual content, user-contributed metadata, automatic speech recognition transcripts, automatic shot boundary files, and social information for multiple 'social levels'. We describe the principal characteristics of this dataset and present results that have been achieved on different tasks.



      Published In

      MMSys '13: Proceedings of the 4th ACM Multimedia Systems Conference
      February 2013
      304 pages
      ISBN:9781450318945
      DOI:10.1145/2483977
      General Chair: Carsten Griwodz
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. SPUG content
      2. dataset
      3. speech retrieval
      4. video tagging

      Qualifiers

      • Research-article

      Conference

      MMSys '13: Multimedia Systems Conference 2013
      February 28 - March 1, 2013
      Oslo, Norway

      Acceptance Rates

      MMSys '13 Paper Acceptance Rate 15 of 63 submissions, 24%;
      Overall Acceptance Rate 176 of 530 submissions, 33%
