DOI: 10.1145/3448016.3452753
short-paper

Demonstrating Robust Voice Querying with MUVE: Optimally Visualizing Results of Phonetically Similar Queries

Published: 18 June 2021

Abstract

Recently proposed voice query interfaces translate voice input into SQL queries. Unreliable speech recognition on top of the intrinsic challenges of text-to-SQL translation makes it hard to reliably interpret user input. We present MUVE (Multiplots for Voice quEries), a system for robust voice querying. MUVE reduces the impact of ambiguous voice queries by filling the screen with multiplots, capturing results of phonetically similar queries. It maps voice input to a probability distribution over query candidates, executes a selected subset of queries, and visualizes their results in a multiplot.
Our goal is to maximize the probability of showing the correct query result. We also optimize the visualization (e.g., by coloring a subset of likely results) to minimize the expected time until users find the correct result. Via a user study, we validate a simple cost model that estimates this overhead. The resulting optimization problem is NP-hard. We propose an exhaustive algorithm, based on integer programming, as well as a greedy heuristic. As shown in a corresponding user study, MUVE enables users to identify accurate results faster than prior work.
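The selection step described above can be illustrated with a minimal sketch. It is not MUVE's actual algorithm or cost model: it assumes each query candidate carries an estimated probability and a hypothetical `width` (the screen units its plot would need), and it uses a simple probability-per-width greedy rule to fill a fixed screen budget. The example queries, probabilities, and the names `Candidate`, `greedy_select`, and `covered_probability` are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Candidate:
    sql: str            # candidate query text (illustrative)
    probability: float  # estimated probability that this is the intended query
    width: int          # hypothetical screen units needed to plot its result


def greedy_select(candidates: List[Candidate], screen_width: int) -> List[Candidate]:
    """Pick candidates in order of probability per screen unit until the budget is used.

    This is a knapsack-style heuristic chosen for illustration only.
    """
    chosen = []
    remaining = screen_width
    ranked = sorted(candidates, key=lambda c: c.probability / c.width, reverse=True)
    for cand in ranked:
        if cand.width <= remaining:
            chosen.append(cand)
            remaining -= cand.width
    return chosen


def covered_probability(chosen: List[Candidate]) -> float:
    """Probability that the intended query is among the displayed ones."""
    return sum(c.probability for c in chosen)


# Phonetically similar interpretations of one spoken query (invented numbers).
DEMO_CANDIDATES = [
    Candidate("SELECT AVG(delay) FROM flights WHERE city = 'Boston'", 0.50, 2),
    Candidate("SELECT AVG(delay) FROM flights WHERE city = 'Austin'", 0.30, 2),
    Candidate("SELECT AVG(delay) FROM flights WHERE city = 'Houston'", 0.15, 2),
    Candidate("SELECT AVG(delay) FROM flights WHERE city = 'Bolton'", 0.05, 1),
]

if __name__ == "__main__":
    shown = greedy_select(DEMO_CANDIDATES, screen_width=5)
    for cand in shown:
        print(f"{cand.probability:.2f}  {cand.sql}")
    print(f"covered probability: {covered_probability(shown):.2f}")
```

With a budget of five screen units, the sketch displays the 'Boston', 'Austin', and 'Bolton' candidates and skips 'Houston', which no longer fits. An exhaustive formulation, such as the integer program mentioned in the abstract, could find better combinations than this greedy rule when plot sizes vary.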

Supplementary Material

MP4 File (3448016.3452753.mp4)
Recently proposed voice query interfaces translate voice input into SQL queries. Unreliable speech recognition on top of the intrinsic challenges of text-to-SQL translation makes it hard to reliably interpret user input. We present MUVE (Multiplots for Voice quEries), a system for robust voice querying. MUVE reduces the impact of ambiguous voice queries by filling the screen with multiplots, capturing results of phonetically similar queries. It maps voice input to a probability distribution over query candidates, executes a selected subset of queries, and visualizes their results in a multiplot. Our goal is to maximize the probability of showing the result of the correct query. This is a hard optimization problem. We propose an exhaustive algorithm, based on integer programming, as well as a greedy heuristic. As shown in a corresponding user study, MUVE enables users to identify accurate results faster than prior work. In our demonstration, participants can experiment with both approaches (accessible in an online interface) and try voice queries on different data sets.


      Published In

      SIGMOD '21: Proceedings of the 2021 International Conference on Management of Data
      June 2021
      2969 pages
      ISBN:9781450383431
      DOI:10.1145/3448016

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. data processing
      2. visualization planning
      3. voice query disambiguation

      Qualifiers

      • Short-paper

      Conference

      SIGMOD/PODS '21

      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%
