DOI: 10.1145/3379337.3415592
research-article
Open access

ReMap: Lowering the Barrier to Help-Seeking with Multimodal Search

Published: 20 October 2020

Abstract

People often seek help online while using complex software. Currently, information search takes users' attention away from the task at hand by creating a separate search task. This paper investigates how multimodal interaction can make in-task help-seeking easier and faster. We introduce ReMap, a multimodal search interface that helps users find video assistance while using desktop and web applications. Users can speak search queries, add application-specific terms deictically (e.g., "how to erase this"), and navigate search results via speech, all without taking their hands (or mouse) off their current task. Thirteen participants who used ReMap in the lab found that it helped them stay focused on their task while simultaneously searching for and using learning videos. Users' experiences with ReMap also raised a number of important challenges with implementing system-wide context-aware multimodal assistance.
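
To illustrate the deictic queries mentioned above (e.g., "how to erase this"), the following minimal Python sketch shows one way a spoken query could be combined with application context. It is an assumption-laden illustration, not ReMap's implementation: the names `UIContext`, `resolve_deictic_query`, and `DEICTIC_WORDS`, as well as the example application and element names, are hypothetical and do not come from the paper.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of deictic words to look for in a transcribed query.
DEICTIC_WORDS = ("this", "that", "these", "those")


@dataclass
class UIContext:
    """Hypothetical snapshot of what the user is pointing at."""
    app_name: str                 # name of the application in focus
    element_name: Optional[str]   # name of the UI element under the pointer, if known


def resolve_deictic_query(spoken: str, ctx: UIContext) -> str:
    """Replace the first deictic word with the pointed-at element's name,
    then append the application name if the query does not already mention it."""
    query = spoken.strip()
    if ctx.element_name:
        pattern = r"\b(" + "|".join(DEICTIC_WORDS) + r")\b"
        # A callable replacement avoids interpreting backslashes in the element name.
        query = re.sub(pattern, lambda m: ctx.element_name, query,
                       count=1, flags=re.IGNORECASE)
    if ctx.app_name.lower() not in query.lower():
        query = f"{query} {ctx.app_name}"
    return query


if __name__ == "__main__":
    ctx = UIContext(app_name="ExampleEditor", element_name="gradient layer")
    print(resolve_deictic_query("how to erase this", ctx))
```

In this sketch, a query with no resolvable pointing target is left unchanged apart from the appended application name, so the search can still proceed with the words the user actually spoke.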

Supplementary Material

  • VTT File (ufp5388pv.vtt)
  • VTT File (ufp5388vf.vtt)
  • VTT File (3379337.3415592.vtt)
  • SRT File (ufp5388pvc.srt): Preview video captions
  • SRT File (ufp5388vfc.srt): Video figure captions
  • ZIP File (ufp5388aux.zip): remap.srt, subtitle file for the full-length video
  • MP4 File (ufp5388pv.mp4): Preview video
  • MP4 File (ufp5388vf.mp4): Video figure
  • MP4 File (3379337.3415592.mp4): Presentation video

Published In

UIST '20: Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology
October 2020
1297 pages
ISBN:9781450375146
DOI:10.1145/3379337
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. contextual search
  2. deixis
  3. multimodal search
  4. speech

Qualifiers

  • Research-article

Conference

UIST '20

Acceptance Rates

Overall Acceptance Rate 561 of 2,567 submissions, 22%
