DOI: 10.1145/1458082.1458092
Research article

How does clickthrough data reflect retrieval quality?

Published: 26 October 2008

Abstract

Automatically judging the quality of retrieval functions based on observable user behavior holds promise for making retrieval evaluation faster, cheaper, and more user-centered. However, the relationship between observable user behavior and retrieval quality is not yet fully understood. We present a sequence of studies investigating this relationship for an operational search engine on the arXiv.org e-print archive. We find that none of the eight absolute usage metrics we explore (e.g., number of clicks, frequency of query reformulations, abandonment) reliably reflect retrieval quality for the sample sizes we consider. However, we find that paired experiment designs adapted from sensory analysis produce accurate and reliable statements about the relative quality of two retrieval functions. In particular, we investigate two paired comparison tests that analyze clickthrough data from an interleaved presentation of ranking pairs, and we find that both give accurate and consistent results. We conclude that both paired comparison tests give substantially more accurate and sensitive evaluation results than absolute usage metrics in our domain.
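The paired comparison tests described in the abstract analyze clicks on an interleaved presentation of two rankings. As a rough illustration of the general idea, and not necessarily either of the two specific tests studied in the paper, the sketch below implements team-draft interleaving, one standard scheme from this line of work: the two rankings alternately contribute their next unseen result, each displayed result records which ranking ("team") supplied it, and clicks are credited to that team to decide a per-impression preference. All identifiers here (team_draft_interleave, infer_preference, the example result IDs) are illustrative assumptions, not code from the paper.

```python
import random

def team_draft_interleave(ranking_a, ranking_b, length=10, rng=random):
    """Team-draft interleaving of two rankings.

    Returns the combined result list and a parallel list of team labels
    ('A' or 'B') recording which input ranking contributed each result.
    """
    a, b = list(ranking_a), list(ranking_b)   # work on copies
    combined, teams, seen = [], [], set()
    while len(combined) < length and (a or b):
        # Each round both teams pick once; a coin flip decides who goes first.
        for team in (['A', 'B'] if rng.random() < 0.5 else ['B', 'A']):
            source = a if team == 'A' else b
            # Skip documents already placed by the other team.
            while source and source[0] in seen:
                source.pop(0)
            if source and len(combined) < length:
                doc = source.pop(0)
                seen.add(doc)
                combined.append(doc)
                teams.append(team)
    return combined, teams

def infer_preference(teams, clicked_positions):
    """Credit each click to the team whose ranking supplied that result."""
    credit = {'A': 0, 'B': 0}
    for pos in clicked_positions:
        credit[teams[pos]] += 1
    if credit['A'] > credit['B']:
        return 'A'
    if credit['B'] > credit['A']:
        return 'B'
    return 'tie'

if __name__ == '__main__':
    ranking_a = ['a1', 'shared', 'a2', 'a3']   # hypothetical result IDs
    ranking_b = ['b1', 'shared', 'b2', 'b3']
    combined, teams = team_draft_interleave(ranking_a, ranking_b,
                                            length=6, rng=random.Random(0))
    print(list(zip(combined, teams)))
    # Suppose the user clicked the results at positions 0 and 2.
    print(infer_preference(teams, clicked_positions=[0, 2]))
```

Aggregating the per-impression preferences over many queries, and testing whether one ranking wins significantly more often than the other, is what makes this a paired comparison design rather than an absolute usage metric.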


    Published In

CIKM '08: Proceedings of the 17th ACM Conference on Information and Knowledge Management
October 2008
1562 pages
ISBN: 9781595939913
DOI: 10.1145/1458082

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. clickthrough data
    2. expert judgments
    3. implicit feedback
    4. retrieval evaluation

    Conference

CIKM '08: Conference on Information and Knowledge Management
October 26-30, 2008
Napa Valley, California, USA

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
