Research article | Open access
DOI: 10.1145/3664190.3672518

Coherence-based Query Performance Measures for Dense Retrieval

Published: 05 August 2024

Abstract

Query Performance Prediction (QPP) estimates the effectiveness of a search engine's results in response to a query, without relevance judgments. Traditionally, post-retrieval predictors have focused on either the distribution of the retrieval scores or the coherence of the top-ranked documents, using bag-of-words index representations. More recently, BERT-based models operating on dense embedded document representations have been used to build new predictors, but these have mostly been applied to predict the performance of rankings produced by BM25. Instead, we aim to predict the effectiveness of rankings produced by single-representation dense retrieval models (ANCE & TCT-ColBERT). To this end, we propose a number of variants of existing unsupervised coherence-based predictors that employ neural embedding representations. In our experiments on the TREC Deep Learning Track datasets, we demonstrate improved prediction accuracy on dense retrieval rankings (improvements of up to 92% over the sparse variants for TCT-ColBERT and 188% for ANCE). Going deeper, we select the most representative and best-performing predictors to study how differences among predictors and query types affect query performance. Using the scaled Absolute Rank Error (sARE) evaluation measure and a particular type of linear mixed model, we find that query types significantly influence query performance (accounting for up to 35% of the unstable performance of QPP predictors), and that this sensitivity is unique to dense retrieval models. In particular, we find that where our predictors perform worse than score-based predictors, this is partly due to the sensitivity of MAP@100 to query types. Our novel analysis provides new insights into dense QPP that can explain the potentially unstable performance of existing predictors, and outlines the distinct behaviour of different query types on dense retrieval models.
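
To make the two key ingredients above concrete, the following Python sketch (our illustration, not the paper's exact formulation) computes a simple coherence score for a dense ranking as the mean pairwise cosine similarity among the top-ranked documents' embeddings, together with the per-query scaled Absolute Rank Error (sARE), i.e. the absolute difference between a query's rank under the predictor and its rank under the true effectiveness measure, divided by the number of queries. The function names and the top-k cutoff are illustrative assumptions.

# Minimal sketch (illustrative, not the paper's exact predictors):
# a pairwise-coherence QPP signal over dense document embeddings,
# and the per-query scaled Absolute Rank Error (sARE).
import numpy as np

def pairwise_coherence(doc_embeddings: np.ndarray, cutoff: int = 50) -> float:
    """Mean pairwise cosine similarity among the embeddings of the
    top-`cutoff` retrieved documents; a higher value indicates a more
    coherent (hypothetically, more effective) ranking."""
    top = doc_embeddings[:cutoff]
    # L2-normalise rows so that dot products equal cosine similarities.
    unit = top / np.clip(np.linalg.norm(top, axis=1, keepdims=True), 1e-12, None)
    sims = unit @ unit.T
    upper = np.triu_indices(len(unit), k=1)  # exclude self-similarities
    return float(sims[upper].mean())

def sare(predictor_scores: np.ndarray, true_effectiveness: np.ndarray) -> np.ndarray:
    """Per-query scaled Absolute Rank Error: |predicted rank - ideal rank| / |Q|,
    where queries are ranked (descending) by predictor score and by the
    true effectiveness metric (e.g. MAP@100), respectively."""
    q = len(predictor_scores)
    pred_rank = np.empty(q, dtype=int)
    pred_rank[np.argsort(-predictor_scores)] = np.arange(1, q + 1)
    true_rank = np.empty(q, dtype=int)
    true_rank[np.argsort(-true_effectiveness)] = np.arange(1, q + 1)
    return np.abs(pred_rank - true_rank) / q

In an actual experiment, doc_embeddings would come from the dense retriever's document encoder (e.g. ANCE or TCT-ColBERT), and a lower mean sARE across queries indicates a better predictor.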


Published In

ICTIR '24: Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval
August 2024, 267 pages
ISBN: 9798400706813
DOI: 10.1145/3664190
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. coherence-based
    2. dense retrieval
    3. query performance prediction

    Acceptance Rates

ICTIR '24 paper acceptance rate: 26 of 45 submissions (58%)
Overall acceptance rate: 235 of 527 submissions (45%)

