research-article

Language model based suggestions of next possible Gurmukhi character or word in online handwriting recognition system

Published: 09 May 2023

Abstract

Prediction models are increasingly used for reasoning and decision making in a wide range of applications. With advances in devices such as tablet PCs, touch-screen smartphones, digital-pen/stylus devices, and digitizers, the demand for real-time applications is also increasing. The present study describes Language Model (LM) based forecasting of the next possible Gurmukhi character or word in a word or sentence, conditioned on the immediately preceding character(s) or word(s) written in a real-time environment. The captured online handwritten character/word is first segmented into its individual strokes, which are recognized using a Support Vector Machine (SVM) classifier. Once a character or word has been recognized, it is used to suggest the next possible character or word to the writer. For this purpose, n-gram language models (bigram and trigram) have been implemented at both the character level and the word level. The model is trained on the “Punjabi Monolingual Text Corpus-AnglaMT” (available at https://tdil-dc.in), which contains approximately 83,000 sentences. Experimental results show that the proposed online handwritten character/word forecasting framework performs well, producing consistent forecasts of the most likely character or word from the given handwritten input while saving computational cost. The model can also be applied to many other Indic and non-Indic scripts.
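
The bigram and trigram suggestion step described in the abstract can be illustrated with a minimal count-based sketch. This is not the authors' implementation: the class name `NGramSuggester`, its methods, and the toy training data are all hypothetical, and the sketch uses plain maximum-likelihood counts without any smoothing or back-off, which the abstract does not specify. It also assumes the recognizer (the SVM stage) has already produced a sequence of tokens, either characters or words.

```python
from collections import Counter, defaultdict


class NGramSuggester:
    """Hypothetical count-based n-gram model that ranks likely next tokens."""

    def __init__(self, n=2):
        if n < 2:
            raise ValueError("n must be at least 2 (2 = bigram, 3 = trigram)")
        self.n = n
        # Maps a context of n-1 tokens to a Counter of the tokens that followed it.
        self.counts = defaultdict(Counter)

    def train(self, sequences):
        """Count n-grams; each sequence is a string or a list of tokens."""
        for seq in sequences:
            tokens = list(seq)
            for i in range(len(tokens) - self.n + 1):
                context = tuple(tokens[i:i + self.n - 1])
                self.counts[context][tokens[i + self.n - 1]] += 1

    def suggest(self, history, k=3):
        """Return up to k (token, relative frequency) pairs, most likely first."""
        context = tuple(list(history)[-(self.n - 1):])
        followers = self.counts.get(context)
        if not followers:
            return []  # unseen context: no suggestion without smoothing/back-off
        total = sum(followers.values())
        return [(tok, count / total) for tok, count in followers.most_common(k)]


# Toy placeholder data (assumption), standing in for corpus sentences.
word_bigram = NGramSuggester(n=2)
word_bigram.train([["w1", "w2", "w3"], ["w1", "w2", "w4"], ["w1", "w2", "w3"]])
print(word_bigram.suggest(["w1", "w2"]))

# Character-level trigram: each training item is one word given as a string.
char_trigram = NGramSuggester(n=3)
char_trigram.train(["abc", "abd", "abc"])
print(char_trigram.suggest("ab", k=1))
```

The same class serves both levels because it only sees token sequences: pass lists of words for word-level suggestions, or strings for character-level ones. Note that iterating over a Python string yields Unicode code points, so for Gurmukhi text a real system would likely need to tokenize into the character units its recognizer emits rather than raw code points.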

Published In

Multimedia Tools and Applications, Volume 82, Issue 30
Dec 2023
1553 pages

Publisher

Kluwer Academic Publishers
United States

Publication History

Published: 09 May 2023
Accepted: 03 February 2023
Revision received: 09 May 2022
Received: 03 February 2022

Author Tags

1. Online handwriting recognition
2. Gurmukhi script
3. SVM classifier
4. Bigram and trigram language models
5. Forecasting probabilities

Qualifiers

• Research-article
