DOI: 10.1145/3380688.3380711

Video-based Skeletal Feature Extraction for Hand Gesture Recognition

Published: 07 March 2020

Abstract

Hand gesture recognition is an active research topic and a key enabler for many types of applications. As computers and intelligent systems become more prevalent in daily life, facilitating natural human-computer interaction becomes more important. In this paper, we focus on a video-based approach to hand gesture recognition integrated with 3-D hand skeletal features: constructing the raw video sequences, retaining the key video frames, extracting spatial-temporal data, and feeding them into a Support Vector Machine model for 2-D hand sign classification. Our novel method integrates a hand skeletal descriptor into the video sequence to retain the spatial-temporal information, which is then extracted as vectors for the classification task. As opposed to conventional methods, which require a well-placed pair of cameras or depth-detection hardware, our method requires only one camera. The proposed approach outperforms state-of-the-art static hand gesture recognition methods, achieving almost 100% accuracy among 24 classes.
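
The key-frame and feature-vector steps described in the abstract could look roughly like the sketch below. It is illustrative only. The abstract does not state how key frames are chosen or how the vectors are laid out, so the motion-based selection rule, the function names `select_key_frames` and `to_feature_vector`, and the representation of a frame as a list of (x, y, z) joint tuples are all assumptions, not the authors' method.

```python
from math import dist


def select_key_frames(frames, num_key_frames):
    """Return indices of the frames with the largest skeletal motion.

    frames: list of frames; each frame is a list of (x, y, z) joint tuples.
    Assumption: a frame's "motion" is the summed joint displacement from
    the previous frame. The paper's actual criterion is not given here.
    """
    if num_key_frames >= len(frames):
        return list(range(len(frames)))
    scores = [0.0]  # the first frame has no predecessor
    for prev, cur in zip(frames, frames[1:]):
        scores.append(sum(dist(p, c) for p, c in zip(prev, cur)))
    # Rank frames by motion, keep the top ones, restore temporal order.
    ranked = sorted(range(len(frames)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:num_key_frames])


def to_feature_vector(frames, key_indices):
    """Flatten the joint coordinates of the selected frames into one vector."""
    return [coord for i in key_indices for joint in frames[i] for coord in joint]


# Toy sequence: 4 frames, 2 joints per frame (hypothetical data).
frames = [
    [(0, 0, 0), (1, 0, 0)],
    [(0, 0, 0), (1, 0, 0)],  # no motion
    [(3, 4, 0), (1, 0, 0)],  # first joint moves by 5
    [(3, 4, 0), (1, 0, 1)],  # second joint moves by 1
]
key_indices = select_key_frames(frames, 2)
vector = to_feature_vector(frames, key_indices)
```

A fixed-length vector of this kind is what a standard SVM implementation (for example, scikit-learn's `SVC`) would take as one training sample, with one such vector per gesture video.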

    Published In

    ICMLSC '20: Proceedings of the 4th International Conference on Machine Learning and Soft Computing
    January 2020
    175 pages
    ISBN: 9781450376310
    DOI: 10.1145/3380688
    In-Cooperation

    • NICT: National Institute of Information and Communications Technology

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Classification
    2. Dynamic Hand Gesture Recognition
    3. Skeletal Data
    4. Static Hand Gesture Recognition
    5. Support Vector Machine (SVM)

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • Tote Board Enabling Lives Initiative Grant and supported by SG Enable.

    Conference

    ICMLSC 2020
