skip to main content
article

Content-based video retrieval in historical collections of the German Broadcasting Archive

Published: 01 June 2019 Publication History

Abstract

The German Broadcasting Archive maintains the cultural heritage of radio and television broadcasts of the former German Democratic Republic (GDR). The uniqueness and importance of the video material fosters a large scientific interest in the video content. In this paper, we present a system for automatic video content analysis and retrieval to facilitate search in historical collections of GDR television recordings. It relies on a distributed, service-oriented architecture and includes video analysis algorithms for shot boundary detection, concept classification, person recognition, text recognition and similarity search. The combination of different search modalities allows users to obtain answers for a wide range of queries, leading to satisfactory results in short time. The performance of the system is evaluated using 2500 h of GDR television recordings.

References

[1]
Ahonen, T., Hadid, A., Pietikainen, M.: Face recognition with local binary patterns. In: Proceedings of the IEEE European Conference on Computer Vision. pp. 469---481 (2004)
[2]
Albertson, D., Ju, B.: Design criteria for video digital libraries: categories of important features emerging from users' responses. Online Inf. Rev. 39(2), 214---228 (2015)
[3]
Belhumeur, P.N., Kriegman, D.J.: Eigenfaces versus fisherfaces: recognition using class specific linear projection. IEEE Trans. Pattern Anal. Mach. Intell. 19(7), 711---720 (1997)
[4]
Breuel, T.M., Ul-Hasan, A., Al-Azawi, M.A., Shafait, F.: High-performance OCR for printed English and Fraktur using LSTM networks. In: Proceedings of the International Conference on Document Analysis and Recognition, pp. 683---687 (2013)
[5]
Chatfield, K., Simonyan, K., Vedaldi, A., Zisserman, A.: Return of the devil in the details: delving deep into convolutional nets. In: Proceedings of the British Machine Vision Conference, pp. 1---11 (2014)
[6]
Christel, M., Kanade, T., Mauldin, M., Reddy, R., Sirbu, M., Stevens, S., Wactlar, H.: Informedia digital video library. Commun. ACM 38(4), 57---58 (1995)
[7]
Chua, T.S., Tang, J., Hong, R., Li, H., Luo, Z., Zheng, Y.: NUS-WIDE: A real-world web image database from National University of Singapore. In: Proceedings of the ACM International Conference on Image and Video Retrieval, pp. 48:1---48:9 (2009)
[8]
Deng, J., Dong, W., Socher, R., Li, L.-J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR '09), pp. 2---9 (2009)
[9]
Ewerth, R., Ballafkir, K., Mühling, M., Seiler, D., Freisleben, B.: Long-term incremental web-supervised learning of visual concepts via random savannas. IEEE Trans. Multimed. 14(4), 1008---1020 (2012)
[10]
Ewerth, R., Freisleben, B.: Video cut detection without thresholds. In: Proceedings of the 11th International Workshop on Signals, Systems and Image Processing (IWSSIP '04), pp. 227---230. Poznan, Poland (2004)
[11]
Ewerth, R., Freisleben, B.: Unsupervised detection of gradual video shot changes with motion-based false alarm removal. In: Proceedings of the 11th Conference on Advanced Concepts for Intelligent Vision Systems, pp. 253---264 (2009)
[12]
Ewerth, R., Mühling, M., Freisleben, B.: Self-supervised learning of face appearances in TV casts and movies. Int. J. Semant. Comput. 1(2), 185---204 (2007)
[13]
Ewerth, R., Mühling, M., Freisleben, B.: Robust video content analysis via transductive learning. ACM Trans. Intell. Syst. Technol. (TIST) 3(3), 1---26 (2011)
[14]
Ewerth, R., Schwalb, M., Tessmann, P., Freisleben, B.: Segmenting Moving Objects in MPEG Videos in the Presence of Camera Motion. In: Image Analysis and Processing, 2007. ICIAP 2007. 14th International Conference on IEEE, pp. 819---824 (2007)
[15]
Gllavata, J., Ewerth, R.: Text detection in images based on unsupervised classification of high-frequency wavelet coefficients. In: Proceedings of 17th International Conference on Pattern Recognition (ICPR '04), pp. 425---428. IEEE (2004)
[16]
Gong, Y., Jia, Y., Leung, T., Toshev, A., Ioffe, S.: Deep Convolutional Ranking for Multilabel Image Annotation. arXiv preprint arXiv:1312.4894 (2013)
[17]
Graves, A., Mohamed, A., Hinton, G.: Speech recognition with deep recurrent neural networks. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2013), pp. 6645---6649 (2013)
[18]
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770---778 (2016)
[19]
Hentschel, C., Blümel, I., Sack, H.: Automatic annotation of scientific video material based on visual concept detection. In: Proceedings of the 13th International Conference on Knowledge Management and Knowledge Technologies, p. 16 (2013)
[20]
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., Guadarrama, S., Darrell, T.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the ACM International Conference on Multimedia, pp. 675---678 (2014)
[21]
Krizhevsky, A., Hinton, G.: Using very deep Autoencoders for content-based image retrieval. In: Proceedings of the European Symposium on Artificial Neural Networks, pp. 1---7 (2011)
[22]
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems, pp. 1---9 (2012)
[23]
Levenshtein, V.I.: Binary codes capable of correcting deletions, insertions, and reversals. Sov. Phys. Doklady. 10, 707---710 (1966)
[24]
Lin, K., Yang, H., Hsiao, J., Chen, C.: Deep learning of binary hash codes for fast image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 27---35 (2015)
[25]
Liu, L., Wang, L., Liu, X.: In defense of soft-assignment coding. In: Proceedings of the 13th IEEE International Conference on Computer Vision, pp. 2486---2493 (2011)
[26]
Lowe, D.G.: Object recognition from local scale-invariant features. In: Proceedings of the 7th IEEE International Conference on Computer Vision, vol. 2, pp. 1150---1157 (1999)
[27]
Marchionini, G., Geisler, G.: The open video digital library. D-Lib. Mag. 8(12), 1082---9873 (2002)
[28]
Matas, J., Chum, O., Urban, M., Pajdla, T.: Robust wide-baseline stereo from maximally stable extremal regions. Image Vis. Comput. 22(10), 761---767 (2004)
[29]
Mühling, M., Markus, M., Ewerth, R., Freisleben, B.: Improving cross-domain concept detection via object-based features. In: Proceedings of the International Conference on Computer Analysis of Images and Patterns (CAIP '15) (2015)
[30]
Mühling, M., Ewerth, R., Freisleben, B.: On the spatial extents of SIFT descriptors for visual concept detection. In: Proceedings of the 8th International Conference on Computer Vision Systems, pp. 71---80. Springer (2011)
[31]
Mühling, M., Ewerth, R., Shi, B., Freisleben, B.: Multi-class object detection with hough forests using local histograms of visual words. In: Proceedings of 14th International Conference on Computer Analysis of Images and Patterns, pp. 386---393. Springer (2011)
[32]
Mühling, M., Ewerth, R., Zhou, J., Freisleben, B.: Multimodal video concept detection via bag of auditory words and multiple kernel learning. In: Proceedings of the 18th International Conference on Advances in Multimedia Modeling, pp. 40---50. Springer (2012)
[33]
Neumann, L., Matas, J.: Real-time scene text localization and recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1---8 (2012)
[34]
Sack, H., Plank, M.: AV-Portal: The German National Library of Science and Technology's Semantic Video Portal. ERCIM News 96 (2014)
[35]
Salakhutdinov, R., Hinton, G.: Semantic hashing. Int. J. Approx. Reason. 50(7), 969---978 (2009)
[36]
Schroff, F., Kalenichenko, D., Philbin, J.: Facenet: A unified embedding for face recognition and clustering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 815---823 (2015)
[37]
Smeulders, A.W., Worring, M., Santini, S., Gupta, A., Jain, R.: Content-based image retrieval at the end of the early years. IEEE Trans. Pattern. Anal. Mach. Intell. 22(12), 1349---1380 (2000)
[38]
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1---9 (2015)
[39]
Taigman, Y., Yang, M., Ranzato, M., Wolf, L.: DeepFace: Closing the gap to human-level performance in face verification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1---8 (2014)
[40]
Viola, P., Jones, M.: Rapid object detection using a boosted cascade of simple features. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 511---518 (2001)
[41]
Wan, J., Wang, D., Hoi, S.C.H., Wu, P.: Deep learning for content-based image retrieval: a comprehensive study. In: Proceedings of the ACM International Conference on Multimedia, pp. 157---166 (2014)
[42]
Yianilos, P.N.: Data structures and algorithms for nearest neighbor search in general metric spaces. In: Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 311---321 (1993)
[43]
Zhou, B., Lapedriza, A., Xiao, J., Torralba, A., Oliva, A.: Learning deep features for scene recognition using places database. Adv. Neural Inf. Process. Syst. 27, 487---495 (2014)

Cited By

View all
  1. Content-based video retrieval in historical collections of the German Broadcasting Archive

    Recommendations

    Comments

    Information & Contributors

    Information

    Published In

    cover image International Journal on Digital Libraries
    International Journal on Digital Libraries  Volume 20, Issue 2
    June 2019
    79 pages
    ISSN:1432-5012
    EISSN:1432-1300
    Issue’s Table of Contents

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Publication History

    Published: 01 June 2019

    Author Tags

    1. Automatic content-based video analysis
    2. Content-based video retrieval
    3. Deep learning
    4. German Broadcasting Archive

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 29 Jan 2025

    Other Metrics

    Citations

    Cited By

    View all

    View Options

    View options

    Figures

    Tables

    Media

    Share

    Share

    Share this Publication link

    Share on social media