DOI: 10.1145/2432553.2432556
research-article

Hindi handwritten word recognition using HMM and symbol tree

Published: 16 December 2012

Abstract

The proposed approach recognizes online handwritten isolated Hindi words by combining HMMs trained on Devanagari symbols with a tree built from the multiple possible sequences of recognized symbols.
In general, words in Indic languages are composed of a number of aksharas, or syllables, which are in turn formed by groups of consonants and vowel modifiers. Accurate akshara segmentation is critical to recognizing both the recognition primitives and the complete word, and recognition is itself an intricate task. Our work targets the combined task of akshara segmentation, symbol identification and subsequent word recognition, and handles it in an integrated segmentation-recognition framework. Online stroke information is used to postulate symbol candidates, and a HOG feature set is derived from their image counterparts, which makes recognition independent of stroke order and stroke shape variations. The system is thus well suited to unconstrained handwriting.
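As a rough illustration of why image-based features do not depend on stroke order, the sketch below rasterizes online strokes onto a small grid and computes a simplified histogram of oriented gradients. It is not the authors' feature extractor: the function names `rasterize` and `hog`, the 16x16 grid, the 8-pixel cells, the 9 orientation bins and the per-cell normalization are all assumptions made for this example, and standard HOG normalizes over overlapping blocks rather than single cells.

```python
import math

def rasterize(strokes, size=16):
    """Draw strokes (lists of (x, y) points in [0, 1]) onto a size x size binary grid.

    Hypothetical helper: the pen trajectory is discarded and only the set of
    inked pixels survives, so stroke order and direction leave no trace.
    """
    img = [[0.0] * size for _ in range(size)]
    steps = 2 * size  # dense sampling along each segment
    for stroke in strokes:
        for (x0, y0), (x1, y1) in zip(stroke, stroke[1:]):
            for i in range(steps + 1):
                # Symmetric interpolation: swapping the endpoints gives the same points.
                x = (x0 * (steps - i) + x1 * i) / steps
                y = (y0 * (steps - i) + y1 * i) / steps
                c = min(size - 1, int(x * (size - 1) + 0.5))
                r = min(size - 1, int(y * (size - 1) + 0.5))
                img[r][c] = 1.0
    return img

def hog(img, cell=8, bins=9):
    """Simplified HOG: one magnitude-weighted orientation histogram per cell."""
    size = len(img)
    feats = []
    for r0 in range(0, size, cell):
        for c0 in range(0, size, cell):
            hist = [0.0] * bins
            for r in range(r0, min(r0 + cell, size)):
                for c in range(c0, min(c0 + cell, size)):
                    # Central differences, clamped at the image border.
                    gx = img[r][min(c + 1, size - 1)] - img[r][max(c - 1, 0)]
                    gy = img[min(r + 1, size - 1)][c] - img[max(r - 1, 0)][c]
                    mag = math.hypot(gx, gy)
                    if mag == 0.0:
                        continue
                    # Unsigned orientation folded into [0, pi).
                    angle = math.atan2(gy, gx) % math.pi
                    hist[min(bins - 1, int(angle / math.pi * bins))] += mag
            # L2-normalize each cell histogram (empty cells stay all zero).
            norm = math.sqrt(sum(h * h for h in hist)) or 1.0
            feats.extend(h / norm for h in hist)
    return feats

# Two made-up strokes: a horizontal headline and a vertical bar.
headline = [(0.1, 0.2), (0.9, 0.2)]
bar = [(0.5, 0.2), (0.5, 0.9)]

features = hog(rasterize([headline, bar]))
# Same ink, written in the opposite order and with each stroke reversed.
features_reordered = hog(rasterize([bar[::-1], headline[::-1]]))
```

Under these assumptions the two feature vectors are identical, because both writings ink the same pixels. The sketch only demonstrates invariance to stroke order and direction; robustness to stroke shape variation is a claim of the paper that this toy example does not test.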
Data for this work was collected from different parts of India where Hindi is predominantly used. Symbols extracted from 60,000 words are used to train and test 140 symbol HMMs. The system outputs one or more candidate words to the user by tracing multiple tree paths (up to the leaf nodes), under the condition that the symbol likelihood (confidence score) at every node is above a threshold. Tests performed on 10,000 words yield an accuracy of 89%.
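The path-tracing step can be pictured with a minimal sketch like the one below. It assumes a tree whose nodes each carry a recognized symbol label and a confidence score, and it collects every root-to-leaf path on which all scores exceed a threshold. The names `SymbolNode` and `candidate_words`, the transliterated labels and the scores are hypothetical; the abstract does not say how the tree is built, how scores are scaled, or how candidates are ranked.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class SymbolNode:
    symbol: str    # label of the recognized symbol (hypothetical transliteration)
    score: float   # symbol likelihood (confidence score) for this hypothesis
    children: List["SymbolNode"] = field(default_factory=list)

def candidate_words(nodes: List[SymbolNode], threshold: float,
                    prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the symbol sequences of all paths that reach a leaf with every
    node's score strictly above the threshold."""
    words = []
    for node in nodes:
        if node.score <= threshold:
            continue  # prune this hypothesis and everything below it
        path = prefix + (node.symbol,)
        if not node.children:
            words.append(path)  # reached a leaf: a complete candidate word
        else:
            words.extend(candidate_words(node.children, threshold, path))
    return words

# Made-up tree: two hypotheses for the first symbol, each with its own continuations.
tree = [
    SymbolNode("ka", 0.9, [SymbolNode("ma", 0.8), SymbolNode("bha", 0.3)]),
    SymbolNode("pha", 0.4, [SymbolNode("la", 0.95)]),
]

strict = candidate_words(tree, threshold=0.5)   # only the path ka -> ma survives
lenient = candidate_words(tree, threshold=0.2)  # all three paths survive
```

Lowering the threshold lets more paths reach a leaf, which matches the description of the system returning one or more candidate words. A path whose first symbol is weak is dropped even when a later symbol on it scores highly, as with `pha` followed by `la` at the stricter threshold.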

Published In

DAR '12: Proceeding of the workshop on Document Analysis and Recognition
December 2012
162 pages
ISBN:9781450317979
DOI:10.1145/2432553

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. HMM
  2. HOG features
  3. Hindi handwritten word recognition
  4. symbol tree

Conference

DAR '12
