short-paper

Interactive Query Clarification and Refinement via User Simulation

Authors: Pierre Erbacher, Ludovic Denoyer, Laure Soulier

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 2420 - 2425

https://doi.org/10.1145/3477495.3531871

Published: 07 July 2022

Abstract

When users initiate search sessions, their queries are often ambiguous or lack context, which results in ineffective document ranking. The Information Retrieval community has proposed multiple approaches to add context and retrieve documents aligned with users' intents. While some work focuses on query disambiguation using users' browsing history, a recent line of work proposes interacting with users by asking clarification questions and/or presenting clarification panels. However, these approaches rely either on a limited number of interactions with the user (i.e., one) or on log-based interactions. In this paper, we propose and evaluate a fully simulated query clarification framework that allows multi-turn interactions between IR systems and user agents.

Supplementary Material

MP4 File (SIGIR22-sp1886.mp4)


Cited By

Hu Z, Feng Y, Luu A, Hooi B, Lipani A (2023). Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulators to Enhance Dialogue System. In Frommholz I, Hopfgartner F, Lee M, Oakes M, Lalmas M, Zhang M, Santos R (Eds.), Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 3953-3957. DOI: 10.1145/3583780.3615220. Online publication date: 21-Oct-2023.
https://dl.acm.org/doi/10.1145/3583780.3615220

Narayanan Y, Fani H (2023). RePair: An Extensible Toolkit to Generate Large-Scale Datasets for Query Refinement via Transformers. In Frommholz I, Hopfgartner F, Lee M, Oakes M, Lalmas M, Zhang M, Santos R (Eds.), Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 5376-5380. DOI: 10.1145/3583780.3615129. Online publication date: 21-Oct-2023.
https://dl.acm.org/doi/10.1145/3583780.3615129

Zerhoudi S, Günther S, Plassmeier K, Borst T, Seifert C, Hagen M, Granitzer M (2022). The SimIIR 2.0 Framework. In Al Hasan M, Xiong L (Eds.), Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 4661-4666. DOI: 10.1145/3511808.3557711. Online publication date: 17-Oct-2022.
https://dl.acm.org/doi/10.1145/3511808.3557711

Index Terms

Interactive Query Clarification and Refinement via User Simulation
1. Information systems
  1. Information retrieval
    1. Users and interactive retrieval

Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2022

3569 pages

ISBN: 9781450387323

DOI: 10.1145/3477495

General Chairs: Enrique Amigo (UNED), Pablo Castells (UAM and Amazon), Julio Gonzalo (UNED)

Program Chairs: Ben Carterette (Spotify), J. Shane Culpepper (RMIT University), Gabriella Kazai (Waseda University)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Short-paper

Funding Sources

ANR JCJC project SESAMS

Conference

SIGIR '22

Sponsor:

SIGIR

SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval

July 11 - 15, 2022

Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics

Total Citations: 3
Total Downloads: 164
Downloads (last 12 months): 54
Downloads (last 6 weeks): 14

Reflects downloads up to 09 Jan 2025
