DOI: 10.1109/ROMAN.2018.8525621

Generation of Gestures During Presentation for Humanoid Robots

Published: 27 August 2018

Abstract

In presentations, gestures play a particularly important role in conveying information effectively. It has been demonstrated that body language expressing the presenter's enthusiasm and intent affects the success of the presentation and the impression left on the audience. For these reasons, presentation robots are required to perform such movements; however, designing these movements manually is a difficult task. In this research, we propose a method that models the relationship between speech prosody and motion using a recurrent neural network and generates appropriate motions directly from prosodic information. This study also proposes a method for generating motions that convey the meaning of specific words. We implement the proposed method on the "Pepper" robot to evaluate its performance.
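
The abstract describes mapping speech prosody to motion with a recurrent neural network. As an illustration only, the sketch below shows one way such a model could be set up: an LSTM that consumes per-frame prosodic features (assumed here to be F0 and energy) and emits joint-angle trajectories. The class name, feature set, joint count, and loss are hypothetical placeholders, not details taken from the paper.

```python
import torch
import torch.nn as nn

class ProsodyToGesture(nn.Module):
    """Hypothetical sketch: an LSTM mapping per-frame prosodic features
    (e.g. F0 and energy) to robot joint-angle trajectories."""

    def __init__(self, n_prosody: int = 2, n_joints: int = 10, hidden: int = 128):
        super().__init__()
        self.rnn = nn.LSTM(n_prosody, hidden, num_layers=2, batch_first=True)
        self.out = nn.Linear(hidden, n_joints)

    def forward(self, prosody: torch.Tensor) -> torch.Tensor:
        # prosody: (batch, frames, n_prosody)
        h, _ = self.rnn(prosody)       # (batch, frames, hidden)
        return self.out(h)             # (batch, frames, n_joints) joint angles


# Toy usage: one utterance of 200 frames with F0 and energy tracks.
model = ProsodyToGesture()
prosody = torch.randn(1, 200, 2)
joints = model(prosody)                # predicted joint-angle trajectory
# Training would regress against recorded joint angles, e.g. with MSE:
loss = nn.MSELoss()(joints, torch.zeros_like(joints))
```

In the setting described by the abstract, the regression targets would come from human motion aligned with the speech, and the output would drive the robot's joints; the specifics above are assumptions for illustration.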


      Published In

      2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN)
      Aug 2018
      1195 pages

      Publisher

      IEEE Press
