Research article | Open access
DOI: 10.1145/3664190.3672518

Coherence-based Query Performance Measures for Dense Retrieval

Published: 05 August 2024

Abstract

Query Performance Prediction (QPP) estimates the effectiveness of a search engine's results in response to a query, without relevance judgments. Traditionally, post-retrieval predictors have focused on either the distribution of the retrieval scores or the coherence of the top-ranked documents, using bag-of-words index representations. More recently, BERT-based models operating on dense embedded document representations have been used to build new predictors, but these have mostly been applied to predict the performance of rankings produced by BM25. Instead, we aim to predict the effectiveness of rankings produced by single-representation dense retrieval models (ANCE & TCT-ColBERT). To this end, we propose a number of variants of existing unsupervised coherence-based predictors that employ neural embedding representations. In our experiments on the TREC Deep Learning Track datasets, we demonstrate improved prediction accuracy on dense retrieval rankings (improvements of up to 92% over the sparse variants for TCT-ColBERT and 188% for ANCE). Going deeper, we select the most representative and best-performing predictors to study how differences among predictors and query types affect query performance. Using the scaled Absolute Rank Error (sARE) evaluation measure and a particular type of linear mixed model, we find that query types significantly influence query performance (accounting for up to 35% of the unstable performance of QPP predictors), and that this sensitivity is unique to dense retrieval models. In particular, we find that where our predictors perform worse than score-based predictors, this is partly due to the sensitivity of MAP@100 to query types. Our novel analysis provides new insights into dense QPP that can explain the potentially unstable performance of existing predictors, and outlines the distinct behaviour of different query types on dense retrieval models.
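
To make the two key ingredients above concrete, the following Python sketch (our illustration, not the paper's exact formulation) computes a simple coherence score for a dense ranking as the mean pairwise cosine similarity among the top-ranked documents' embeddings, together with the per-query scaled Absolute Rank Error (sARE), i.e. the absolute difference between a query's rank under the predictor and its rank under the true effectiveness measure, divided by the number of queries. The function names and the top-k cutoff are illustrative assumptions.

# Minimal sketch (illustrative, not the paper's exact predictors):
# a pairwise-coherence QPP signal over dense document embeddings,
# and the per-query scaled Absolute Rank Error (sARE).
import numpy as np

def pairwise_coherence(doc_embeddings: np.ndarray, cutoff: int = 50) -> float:
    """Mean pairwise cosine similarity among the embeddings of the
    top-`cutoff` retrieved documents; a higher value indicates a more
    coherent (hypothetically, more effective) ranking."""
    top = doc_embeddings[:cutoff]
    # L2-normalise rows so that dot products equal cosine similarities.
    unit = top / np.clip(np.linalg.norm(top, axis=1, keepdims=True), 1e-12, None)
    sims = unit @ unit.T
    upper = np.triu_indices(len(unit), k=1)  # exclude self-similarities
    return float(sims[upper].mean())

def sare(predictor_scores: np.ndarray, true_effectiveness: np.ndarray) -> np.ndarray:
    """Per-query scaled Absolute Rank Error: |predicted rank - ideal rank| / |Q|,
    where queries are ranked (descending) by predictor score and by the
    true effectiveness metric (e.g. MAP@100), respectively."""
    q = len(predictor_scores)
    pred_rank = np.empty(q, dtype=int)
    pred_rank[np.argsort(-predictor_scores)] = np.arange(1, q + 1)
    true_rank = np.empty(q, dtype=int)
    true_rank[np.argsort(-true_effectiveness)] = np.arange(1, q + 1)
    return np.abs(pred_rank - true_rank) / q

In an actual experiment, doc_embeddings would come from the dense retriever's document encoder (e.g. ANCE or TCT-ColBERT), and a lower mean sARE across queries indicates a better predictor.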


Published In

ICTIR '24: Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval
August 2024, 267 pages
ISBN: 9798400706813
DOI: 10.1145/3664190
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. coherence-based
    2. dense retrieval
    3. query performance prediction

    Acceptance Rates

ICTIR '24 paper acceptance rate: 26 of 45 submissions (58%)
Overall acceptance rate: 235 of 527 submissions (45%)

