DOI: 10.1145/3206025.3206051
research-article

Scene Text Detection and Tracking in Video with Background Cues

Published: 05 June 2018

Abstract

Detecting scene text in video is valuable to many content-based video applications. In this paper, we present a novel scene text detection and tracking method for video that exploits cues from the background regions of text. Specifically, we first extract text candidates and potential text background regions from each video frame. We then model the spatial, shape, and motion correlations between text and its background region with a bipartite graph and apply a random walk algorithm to refine the text candidates for improved accuracy. We also present a tracking framework for text in video that uses the temporal correlation of text cues across successive frames to improve both the precision and the recall of the final detection result. Experiments on public scene text video datasets demonstrate the state-of-the-art performance of the proposed method.
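
The abstract does not give the graph construction or the update rule, so the following is only an illustrative sketch of how a random walk over a bipartite graph of text candidates and background regions could refine candidate scores. The function name `refine_scores`, the restart weight `alpha`, the prior scores, and the affinity matrix (standing in for the combined spatial, shape, and motion correlations) are all assumptions made for illustration, not the authors' formulation.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([v / total if total > 0 else 0.0 for v in row])
    return out


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def matvec(matrix, vec):
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def refine_scores(text_prior, bg_prior, affinity, alpha=0.5, iters=100, tol=1e-9):
    """Random walk with restart on a bipartite text/background graph.

    text_prior[i]  : initial confidence of text candidate i (assumed input)
    bg_prior[j]    : initial confidence of background region j (assumed input)
    affinity[i][j] : non-negative correlation between candidate i and region j
    alpha          : weight of the score propagated from the other side;
                     (1 - alpha) pulls each node back towards its prior
    """
    p_tb = row_normalize(affinity)             # text node -> background nodes
    p_bt = row_normalize(transpose(affinity))  # background node -> text nodes
    t, b = list(text_prior), list(bg_prior)
    for _ in range(iters):
        t_new = [(1 - alpha) * p + alpha * s
                 for p, s in zip(text_prior, matvec(p_tb, b))]
        b_new = [(1 - alpha) * p + alpha * s
                 for p, s in zip(bg_prior, matvec(p_bt, t_new))]
        delta = max(abs(x - y) for x, y in zip(t + b, t_new + b_new))
        t, b = t_new, b_new
        if delta < tol:  # scores have stopped changing
            break
    return t, b


# Two equally confident candidates; only the first lies on a strong
# background region (for example, a signboard).
text_scores, bg_scores = refine_scores(
    text_prior=[0.6, 0.6],
    bg_prior=[0.9, 0.1],
    affinity=[[1.0, 0.0], [0.0, 1.0]],
)
```

In this toy setup the first candidate's score rises to about 0.70 and the second falls to about 0.43, showing how background evidence can separate candidates that a text detector alone scores equally.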

      Published In

      ICMR '18: Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval
      June 2018
      550 pages
      ISBN: 9781450350464
      DOI: 10.1145/3206025
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. background
      2. scene text
      3. text detection
      4. tracking
      5. video

      Conference

      ICMR '18
      Acceptance Rates

      ICMR '18 Paper Acceptance Rate 44 of 136 submissions, 32%;
      Overall Acceptance Rate 254 of 830 submissions, 31%
