DOI: 10.1145/2505515.2505687
Research Article

Using historical click data to increase interleaving sensitivity

Published: 27 October 2013

Abstract

Interleaving is an online evaluation method that compares two alternative ranking functions based on users' implicit feedback. In an interleaving experiment, the results from the two ranking functions are merged into a single result list and presented to users. The users' click feedback on the merged result list is then analysed to derive a preference between the ranking functions. An important property of interleaving methods is their sensitivity, i.e. their ability to reliably derive the comparison outcome from a relatively small amount of user behaviour data. High sensitivity allows changes to a search engine's ranking functions to be tested frequently and, as a result, enables rapid iteration on search quality improvements. In this paper we propose a novel approach that further improves interleaving sensitivity by using pre-experimental user behaviour data. In particular, the click history is used to train a click model, which is then used to predict which interleaved result pages are likely to contribute to the experiment outcome. The probabilities of presenting these interleaved result pages to users are then optimised so that the sensitivity of interleaving is maximised. To evaluate the proposed approach, we re-use data from six actual interleaving experiments previously performed by a commercial search engine. Our results demonstrate that the proposed approach outperforms a state-of-the-art baseline, achieving up to a 48% median reduction in the number of impressions required for the same level of confidence.
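To make the approach concrete, here is a minimal sketch in Python of the idea the abstract describes: a click model fitted on historical logs scores candidate interleaved result pages by how much discriminative click signal they are expected to produce, and presentation probability is shifted toward the more informative pages. The position-based click model, the proportional reweighting, and all names below are illustrative assumptions, not the paper's exact formulation (which optimises presentation probabilities under constraints).

from dataclasses import dataclass
from typing import List

# Rank-wise examination probabilities, as would be fitted from
# historical click logs (values here are purely illustrative).
EXAM_PROB = [0.68, 0.51, 0.40, 0.30, 0.24]

@dataclass
class InterleavedPage:
    team: List[str]         # ranker ("A" or "B") credited at each position
    relevance: List[float]  # click-model estimate of P(click | examined)

def expected_signal(page: InterleavedPage) -> float:
    # Proxy for how strongly this page is expected to discriminate
    # between the rankers: the predicted gap in clicks credited to A vs B.
    score = {"A": 0.0, "B": 0.0}
    for rank, (team, rel) in enumerate(zip(page.team, page.relevance)):
        score[team] += EXAM_PROB[rank] * rel
    return abs(score["A"] - score["B"])

def presentation_probs(pages: List[InterleavedPage]) -> List[float]:
    # Shift presentation probability toward pages predicted to carry
    # more signal; simple proportional reweighting stands in for the
    # constrained optimisation performed in the paper.
    signals = [expected_signal(p) for p in pages]
    total = sum(signals)
    if total == 0.0:
        return [1.0 / len(pages)] * len(pages)  # no signal: fall back to uniform
    return [s / total for s in signals]

# Two candidate interleavings of the same query's results: the first
# places highly clickable documents where team credit differs, so it
# receives more presentation mass (roughly 0.83 vs 0.17 here).
pages = [
    InterleavedPage(team=["A", "B", "A", "B"], relevance=[0.9, 0.9, 0.2, 0.2]),
    InterleavedPage(team=["A", "B", "B", "A"], relevance=[0.5, 0.5, 0.5, 0.5]),
]
print(presentation_probs(pages))

Under these assumptions, impressions are concentrated on interleaved pages where the click model predicts the rankers' contributions will actually diverge, which is the mechanism by which an experiment can reach a given confidence level with fewer impressions.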


Published In

CIKM '13: Proceedings of the 22nd ACM international conference on Information & Knowledge Management
October 2013
2612 pages
ISBN: 9781450322638
DOI: 10.1145/2505515

Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

interleaving, online evaluation


Conference

CIKM'13: 22nd ACM International Conference on Information and Knowledge Management
October 27 - November 1, 2013
San Francisco, California, USA

Acceptance Rates

CIKM '13 paper acceptance rate: 143 of 848 submissions, 17%
Overall acceptance rate: 1,861 of 8,427 submissions, 22%
