research-article

Open access

ReMap: Lowering the Barrier to Help-Seeking with Multimodal Search

Authors: C. Ailie Fraser, Julia M. Markel, Mira Dontcheva, Scott Klemmer

UIST '20: Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology

Pages 979 - 986

https://doi.org/10.1145/3379337.3415592

Published: 20 October 2020

Abstract

People often seek help online while using complex software. Currently, information search takes users' attention away from the task at hand by creating a separate search task. This paper investigates how multimodal interaction can make in-task help-seeking easier and faster. We introduce ReMap, a multimodal search interface that helps users find video assistance while using desktop and web applications. Users can speak search queries, add application-specific terms deictically (e.g., "how to erase this"), and navigate search results via speech, all without taking their hands (or mouse) off their current task. Thirteen participants who used ReMap in the lab found that it helped them stay focused on their task while simultaneously searching for and using learning videos. Users' experiences with ReMap also raised a number of important challenges with implementing system-wide context-aware multimodal assistance.

Index Terms

ReMap: Lowering the Barrier to Help-Seeking with Multimodal Search
1. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction devices
      1. Sound-based input / output
    2. Interaction paradigms
      1. Graphical user interfaces

Published In

UIST '20: Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology

October 2020

1297 pages

ISBN: 9781450375146

DOI: 10.1145/3379337

General Chairs: Shamsi Iqbal (Microsoft Research, USA), Karon MacLean (University of British Columbia, Canada)

Program Chairs: Fanny Chevalier (University of Toronto, Canada), Stefanie Mueller (MIT CSAIL, USA)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery, New York, NY, United States

Conference

UIST '20: The 33rd Annual ACM Symposium on User Interface Software and Technology

October 20 - 23, 2020

Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%
