Research article
DOI: 10.1145/1963405.1963464

On the informativeness of cascade and intent-aware effectiveness measures

Published: 28 March 2011

Abstract

The Maximum Entropy Method provides one technique for validating search engine effectiveness measures. Under this method, the value of an effectiveness measure is used as a constraint to estimate the most likely distribution of relevant documents under a maximum entropy assumption. The inferred distribution can then be compared to the actual distribution to quantify the "informativeness" of the measure, and it can also be used to estimate the values of other effectiveness measures. Previous work focused on traditional effectiveness measures, such as average precision. In this paper, we extend the Maximum Entropy Method to the newer cascade and intent-aware effectiveness measures by accounting for dependencies among the documents ranked in a result list. These measures are intended to reflect the novelty and diversity of search results in addition to traditional relevance. Our results indicate that intent-aware measures based on the cascade model are informative, both for inferring the actual distribution of relevant documents and for predicting the values of other retrieval measures.
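
To make the estimation step concrete, the following is a minimal sketch of the general idea under simplifying assumptions: a single ranked list, independent per-rank relevance probabilities, rank-biased precision (RBP) as the constrained measure, and an off-the-shelf constrained optimizer. The function names and parameter values are illustrative and are not taken from the paper, which additionally models dependencies among ranked documents for cascade and intent-aware measures.

```python
# Minimal maximum-entropy sketch (illustrative assumptions, not the paper's exact formulation).
import numpy as np
from scipy.optimize import minimize

def binary_entropy(p):
    """Entropy of independent Bernoulli relevance variables, one per rank."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def infer_relevance_probs(observed_rbp, depth=20, beta=0.8):
    """Maximum-entropy per-rank relevance probabilities p_1..p_depth subject to
    E[RBP] = observed_rbp, where E[RBP] = (1 - beta) * sum_i beta**(i-1) * p_i."""
    weights = (1 - beta) * beta ** np.arange(depth)
    objective = lambda p: -binary_entropy(p).sum()              # maximize total entropy
    constraint = {"type": "eq",
                  "fun": lambda p: weights @ p - observed_rbp}  # match the observed measure value
    result = minimize(objective, np.full(depth, 0.5), method="SLSQP",
                      bounds=[(0.0, 1.0)] * depth, constraints=[constraint])
    return result.x

# Infer a relevance distribution from a single observed RBP value ...
p = infer_relevance_probs(0.45)
# ... then use it to predict the value of a different measure, e.g. precision at 10.
print("predicted P@10:", p[:10].mean())
```

The inferred probabilities play the role of the most likely relevance distribution: comparing them against the actual judgments quantifies the measure's informativeness, and plugging them into a different measure yields a prediction of its value.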

Published In

WWW '11: Proceedings of the 20th International Conference on World Wide Web
March 2011
840 pages
ISBN:9781450306324
DOI:10.1145/1963405
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. diversity
  2. effectiveness measures
  3. evaluation
  4. measure informativeness
  5. novelty

Conference

WWW '11: 20th International World Wide Web Conference
March 28 - April 1, 2011
Hyderabad, India

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)
