DOI: 10.1145/1027527.1027679
Article

Story boundary detection in large broadcast news video archives: techniques, experience and trends

Published: 10 October 2004 Publication History

Abstract

The segmentation of news video into story units is an important step towards effective processing and management of large news video archives. In the story segmentation task of TRECVID 2003, many research groups employed a wide variety of techniques to segment over 120 hours of news video. The techniques ranged from simple anchor-person detectors to sophisticated machine learning models based on hidden Markov model (HMM) and Maximum Entropy (ME) approaches. The general results indicate that the judicious use of multi-modality features, coupled with rigorous machine learning models, can produce effective solutions. This paper presents the algorithms used and the experience gained in the TRECVID evaluations. It also points the way towards the development of scalable technology for processing large news video corpora.
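
As a rough illustration of the kind of Maximum Entropy model the abstract mentions, the sketch below trains a binary ME classifier (equivalent to logistic regression) to decide whether a candidate point in a broadcast is a story boundary. It is not the method of any TRECVID participant. The three binary features (anchor-person shot, long audio pause, cue phrase in the transcript), the toy data, and all function names are assumptions made purely for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_maxent(samples, labels, epochs=2000, lr=0.5):
    """Fit a binary maximum-entropy model by batch gradient ascent
    on the average log-likelihood. Returns (weights, bias)."""
    n_features = len(samples[0])
    n = len(samples)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = y - p  # gradient of the log-likelihood w.r.t. the score
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w + lr * g / n for w, g in zip(weights, grad_w)]
        bias += lr * grad_b / n
    return weights, bias


def boundary_probability(x, weights, bias):
    """Probability that a candidate point is a story boundary."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical multi-modal features per candidate point:
# [anchor_person_shot, long_audio_pause, transcript_cue_phrase]
# Toy labels (invented): boundary when at least two cues agree.
TOY_FEATURES = [
    [1, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1],
    [0, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0],
]
TOY_LABELS = [1, 1, 1, 1, 0, 0, 0, 0]

if __name__ == "__main__":
    w, b = train_maxent(TOY_FEATURES, TOY_LABELS)
    for x in TOY_FEATURES:
        print(x, round(boundary_probability(x, w, b), 3))
```

The point of the sketch is only that evidence from several modalities is combined into one probabilistic decision, rather than relying on a single detector such as anchor-person detection alone. A real system would use richer features and held-out evaluation data.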


Published In

MULTIMEDIA '04: Proceedings of the 12th annual ACM international conference on Multimedia
October 2004
1028 pages
ISBN:1581138938
DOI:10.1145/1027527
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. machine learning techniques
  2. news video
  3. story segmentation

Qualifiers

  • Article

Conference

MM04

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
