Measuring novelty and redundancy with multiple modalities in cross-lingual broadcast news

Published: 01 June 2008

Abstract

News videos from different channels and in different languages are broadcast every day, providing abundant information for users. To search, retrieve, browse and track news stories effectively, a measure of story similarity is critical for assessing novelty and redundancy among stories. In this paper, we explore different measures for novelty and redundancy detection in cross-lingual news stories. A news story is represented by multimodal features: a sequence of keyframes from the visual track, and a set of words and named entities extracted from the speech transcript of the audio track. Vector space models and language models are constructed on the individual features (text, named entities and keyframes) to compare the similarity among stories. The modalities are then fused to improve performance. Experiments on the TRECVID-2005 cross-lingual news video corpus show that the modalities and measures perform differently for novelty and redundancy detection. Language models on text are appropriate for detecting completely redundant stories, while Cosine Distance on keyframes is suitable for detecting somewhat redundant stories. Performance on monolingual topics is better than on multilingual topics. Textual and visual features complement each other, and fusing text, named entities and keyframes substantially improves performance over approaches that use individual features alone.
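The two families of measures named in the abstract, and the fusion step, can be illustrated with a minimal Python sketch. It is not the authors' formulation: the abstract does not give the exact weighting, smoothing, or fusion scheme, so the interpolation with a pooled background model, the parameter `lam`, the weighted-sum fusion, and all function names here are assumptions made for illustration. The sketch also computes cosine similarity rather than a distance, and treats each modality as a plain list of tokens (words, named entities, or hypothetical keyframe cluster identifiers).

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Vector space measure: cosine of the angle between two raw count vectors."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def kl_divergence(a, b, lam=0.5):
    """Language model measure: KL(P_a || P_b) between smoothed unigram models.

    Assumption: each story model is linearly interpolated with a background
    model pooled from the two stories, so every probability is non-zero.
    """
    if not a or not b:
        raise ValueError("both stories must contain at least one token")
    va, vb = Counter(a), Counter(b)
    background = va + vb
    total = sum(background.values())
    kl = 0.0
    for term, count in background.items():
        p_bg = count / total
        p_a = lam * va[term] / len(a) + (1 - lam) * p_bg
        p_b = lam * vb[term] / len(b) + (1 - lam) * p_bg
        kl += p_a * math.log(p_a / p_b)
    return kl


def fuse(scores, weights):
    """Weighted linear fusion of per-modality similarity scores (an assumption)."""
    total_weight = sum(weights[m] for m in scores)
    return sum(weights[m] * scores[m] for m in scores) / total_weight


if __name__ == "__main__":
    story_a = "flood hits city centre rescue teams deployed".split()
    story_b = "rescue teams deployed after flood hits city".split()
    text_sim = cosine_similarity(story_a, story_b)
    print("cosine similarity:", text_sim)
    print("KL divergence:", kl_divergence(story_a, story_b))
    # Hypothetical keyframe score and weights, chosen only for the demo.
    print("fused:", fuse({"text": text_sim, "keyframe": 0.4},
                         {"text": 0.6, "keyframe": 0.4}))
```

A higher cosine score indicates more overlap, whereas a lower KL divergence indicates that one story's word distribution is well predicted by the other's, so the two measures need to be put on a common scale before they are fused.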



Published In

Computer Vision and Image Understanding  Volume 110, Issue 3
June, 2008
124 pages

Publisher

Elsevier Science Inc.

United States

Author Tags

  1. Cross-lingual
  2. Multimodality
  3. Near-duplicate keyframe
  4. News videos
  5. Novelty
  6. Redundancy detection
  7. Similarity measure

Qualifiers

  • Article
