Text Recognition in UAV Aerial Images

Research article
DOI: 10.1145/3488933.3488935
Published: 25 February 2022

Abstract

Text recognition in unmanned aerial vehicle (UAV) aerial images is an important branch of machine intelligence that can supply discriminative information to downstream applications. Text recognition methods have advanced considerably, but recognizing distorted and slanted text remains a challenge. To address this, we construct a text recognition network with a rectification module and propose a new text recognition method for UAV aerial images. The model consists of two parts: a rectification network and a recognition network. The rectification network can be optimized without manual annotation, and it regularizes the distorted and inclined text found in UAV images. The recognition network introduces an attention mechanism and improves the decoder so that it performs bidirectional recognition of the rectified text. We verify the effectiveness of the rectification network through extensive experiments and show that the model combining the rectification network and the recognition network achieves the best recognition performance in those experiments.
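The abstract does not say how the outputs of the two decoding directions are combined. One common strategy in attentional recognizers is to keep whichever direction has the higher sequence score, and the sketch below illustrates that strategy only. The names `Hypothesis` and `merge_bidirectional`, the assumption that the right-to-left decoder emits characters last to first, and the example log-probabilities are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hypothesis:
    """One decoder output: characters in emission order plus per-step log-probabilities."""
    chars: List[str]
    log_probs: List[float]

    @property
    def score(self) -> float:
        # Sum of log-probabilities, i.e. the log of the joint sequence probability.
        return sum(self.log_probs)


def merge_bidirectional(l2r: Hypothesis, r2l: Hypothesis) -> str:
    """Return the reading-order text of the higher-scoring direction.

    Assumption: the right-to-left decoder emits characters last to first,
    so its output is reversed to recover reading order.
    """
    l2r_text = "".join(l2r.chars)
    r2l_text = "".join(reversed(r2l.chars))
    # Ties go to the left-to-right result (an arbitrary choice for this sketch).
    return l2r_text if l2r.score >= r2l.score else r2l_text


if __name__ == "__main__":
    # Made-up scores: the forward pass misreads "I" as "1" with low confidence.
    forward = Hypothesis(list("EX1T"), [-0.1, -0.2, -1.6, -0.1])
    backward = Hypothesis(list("TIXE"), [-0.1, -0.3, -0.2, -0.1])
    print(merge_bidirectional(forward, backward))  # prints "EXIT"
```

Selecting by score lets the direction that is more confident about an ambiguous character override the other, which is one plausible motivation for decoding in both directions.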

Published In

AIPR '21: Proceedings of the 2021 4th International Conference on Artificial Intelligence and Pattern Recognition
September 2021
715 pages
ISBN: 9781450384087
DOI: 10.1145/3488933
Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. Aerial images
2. Rectification network
3. Text recognition
4. UAV

Qualifiers

• Research-article
• Research
• Refereed limited

Conference

AIPR 2021
