Natural interaction with a virtual guide in a virtual environment

Hofs, Dennis; Theune, Mariët; op den Akker, Rieks

doi:10.1007/s12193-009-0024-6

A multimodal dialogue system

Original Paper
Open access
Published: 05 December 2009

Volume 3, pages 141–153 (2010)

Journal on Multimodal User Interfaces

Abstract

This paper describes the Virtual Guide, a multimodal dialogue system represented by an embodied conversational agent that helps users find their way in a virtual environment while adapting its affective linguistic style to that of the user. We discuss the modular architecture of the system and describe the entire loop from multimodal input analysis to multimodal output generation. We also describe how the Virtual Guide detects the level of politeness of the user’s utterances in real time during the dialogue and aligns its own language to that of the user, using different politeness strategies. Finally, we report on our first user tests and discuss some potential extensions to improve the system.

Author information

Authors and Affiliations

Human Media Interaction, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands
Dennis Hofs, Mariët Theune & Rieks op den Akker

Corresponding author

Correspondence to Mariët Theune.

Rights and permissions

Open Access. This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.

About this article

Cite this article

Hofs, D., Theune, M. & op den Akker, R. Natural interaction with a virtual guide in a virtual environment. J Multimodal User Interfaces 3, 141–153 (2010). https://doi.org/10.1007/s12193-009-0024-6

Received: 06 April 2009
Accepted: 12 November 2009
Published: 05 December 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s12193-009-0024-6
