article

Content-based video retrieval in historical collections of the German Broadcasting Archive

Authors: Markus Mühling, Nikolaus Korfhage, Angelika Hörth, Bernd Freisleben

International Journal on Digital Libraries, Volume 20, Issue 2

Pages 167 - 183

https://doi.org/10.1007/s00799-018-0236-z

Published: 01 June 2019

Abstract

The German Broadcasting Archive maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of this video material have attracted considerable scientific interest in its content. In this paper, we present a system for automatic video content analysis and retrieval that facilitates search in historical collections of GDR television recordings. It relies on a distributed, service-oriented architecture and includes video analysis algorithms for shot boundary detection, concept classification, person recognition, text recognition, and similarity search. Combining these search modalities lets users answer a wide range of queries and obtain satisfactory results in a short time. The performance of the system is evaluated using 2500 h of GDR television recordings.
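To make the first analysis step named in the abstract concrete, the following is a minimal sketch of hard-cut shot boundary detection. It is not the method used in the paper, which the abstract does not describe. It assumes a common textbook approach: compare gray-level histograms of consecutive frames and report a cut when their distance exceeds a fixed threshold. The function names `gray_histogram` and `detect_cuts`, the bin count, the threshold value, and the toy frames are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence


def gray_histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Return a normalized histogram of 8-bit gray values (0..255)."""
    counts = [0] * bins
    for value in frame:
        counts[min(value * bins // 256, bins - 1)] += 1
    total = len(frame) or 1  # avoid division by zero for empty frames
    return [c / total for c in counts]


def detect_cuts(
    frames: Sequence[Sequence[int]],
    threshold: float = 0.5,  # assumed value, not taken from the paper
    bins: int = 8,
) -> List[int]:
    """Return every index i where a hard cut is assumed between frames i-1 and i."""
    cuts: List[int] = []
    previous: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        current = gray_histogram(frame, bins)
        if previous is not None:
            # Half the L1 distance between two normalized histograms lies in [0, 1].
            distance = 0.5 * sum(abs(a - b) for a, b in zip(current, previous))
            if distance > threshold:
                cuts.append(index)
        previous = current
    return cuts


# Toy "video": three dark frames followed by two bright frames.
dark = [10] * 16
bright = [240] * 16
frames = [dark, dark, dark, bright, bright]
print(detect_cuts(frames))  # [3]
```

A fixed threshold is the simplest possible design and tends to miss gradual transitions such as fades and dissolves, so a production system for archival footage would likely need something more robust.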






Published In


International Journal on Digital Libraries Volume 20, Issue 2

June 2019

79 pages

ISSN:1432-5012

EISSN:1432-1300


Copyright © 2019 Springer-Verlag GmbH Germany, part of Springer Nature.

Publisher

Springer-Verlag

Berlin, Heidelberg
