DOI: 10.1145/3477495.3531871
short-paper

Interactive Query Clarification and Refinement via User Simulation

Published: 07 July 2022 Publication History

Abstract

When users initiate search sessions, their queries are often ambiguous or lack context, which results in inefficient document ranking. Multiple approaches have been proposed by the Information Retrieval community to add context and retrieve documents aligned with users' intents. While some work focuses on query disambiguation using users' browsing history, a recent line of work proposes to interact with users by asking clarification questions and/or proposing clarification panels. However, these approaches rely on either a limited number of interactions with the user (i.e., one) or log-based interactions. In this paper, we propose and evaluate a fully simulated query clarification framework allowing multi-turn interactions between IR systems and user agents.
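
The abstract does not describe the framework's internals, so the following is only a toy sketch of what a multi-turn loop between an IR system and a simulated user agent could look like. Every name here (`SimulatedUser`, `propose_refinements`, `clarify`), the term-overlap scoring, and the example vocabulary are assumptions made for illustration, not the paper's method.

```python
from dataclasses import dataclass


@dataclass
class SimulatedUser:
    """Hypothetical user agent holding a hidden intent as a set of terms."""
    intent_terms: frozenset

    def answer(self, current, candidates):
        # Count how many hidden intent terms a query covers.
        def covered(query):
            return len(self.intent_terms & set(query.split()))

        # Accept the best candidate only if it covers more of the intent
        # than the current query; otherwise reject all proposals.
        best = max(candidates, key=covered)
        return best if covered(best) > covered(current) else None


def propose_refinements(query, vocabulary):
    """Toy system side: extend the query with each unused vocabulary term."""
    used = set(query.split())
    return [f"{query} {term}" for term in vocabulary if term not in used]


def clarify(query, user, vocabulary, max_turns=3):
    """Run up to max_turns clarification turns and return the query history."""
    history = [query]
    for _ in range(max_turns):
        candidates = propose_refinements(history[-1], vocabulary)
        if not candidates:
            break
        choice = user.answer(history[-1], candidates)
        if choice is None:  # the simulated user rejects every proposal
            break
        history.append(choice)
    return history


# Assumed example: an ambiguous query and a user who wants car prices.
user = SimulatedUser(frozenset({"jaguar", "car", "price"}))
vocabulary = ["animal", "car", "habitat", "price"]
history = clarify("jaguar", user, vocabulary)
print(history)
```

In this sketch the interaction stops as soon as no proposal brings the query closer to the hidden intent, which is one simple way to model a user declining further clarification; a real framework would replace the overlap heuristic with a retrieval model and a learned or rule-based user agent.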

Supplementary Material

MP4 File (SIGIR22-sp1886.mp4)

Cited By

  • (2023) Unlocking the Potential of User Feedback: Leveraging Large Language Model as User Simulators to Enhance Dialogue System. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 3953-3957. DOI: 10.1145/3583780.3615220. Online publication date: 21-Oct-2023
  • (2023) RePair: An Extensible Toolkit to Generate Large-Scale Datasets for Query Refinement via Transformers. Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 5376-5380. DOI: 10.1145/3583780.3615129. Online publication date: 21-Oct-2023
  • (2022) The SimIIR 2.0 Framework. Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 4661-4666. DOI: 10.1145/3511808.3557711. Online publication date: 17-Oct-2022

    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
    ISBN: 9781450387323
    DOI: 10.1145/3477495

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. interactive information retrieval
    2. query clarification
    3. simulation

    Qualifiers

    • Short-paper

    Funding Sources

    • ANR JCJC project SESAMS

    Conference

    SIGIR '22

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
