Research article
DOI: 10.1145/1963405.1963464

On the informativeness of cascade and intent-aware effectiveness measures

Published: 28 March 2011

Abstract

The Maximum Entropy Method provides one technique for validating search engine effectiveness measures. Under this method, the value of an effectiveness measure is used as a constraint to estimate the most likely distribution of relevant documents under a maximum entropy assumption. The inferred distribution can then be compared to the actual distribution to quantify the "informativeness" of the measure, and it can also be used to estimate the values of other effectiveness measures. Previous work focused on traditional effectiveness measures, such as average precision. In this paper, we extend the Maximum Entropy Method to the newer cascade and intent-aware effectiveness measures by accounting for dependencies among the documents ranked in a result list. These measures are intended to reflect the novelty and diversity of search results in addition to traditional relevance. Our results indicate that intent-aware measures based on the cascade model are informative, both for inferring the actual distribution of relevant documents and for predicting the values of other retrieval measures.
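
To make the estimation step concrete, the following is a minimal sketch of the general idea under simplifying assumptions: a single ranked list, independent per-rank relevance probabilities, rank-biased precision (RBP) as the constrained measure, and an off-the-shelf constrained optimizer. The function names and parameter values are illustrative and are not taken from the paper, which additionally models dependencies among ranked documents for cascade and intent-aware measures.

```python
# Minimal maximum-entropy sketch (illustrative assumptions, not the paper's exact formulation).
import numpy as np
from scipy.optimize import minimize

def binary_entropy(p):
    """Entropy of independent Bernoulli relevance variables, one per rank."""
    p = np.clip(p, 1e-9, 1 - 1e-9)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def infer_relevance_probs(observed_rbp, depth=20, beta=0.8):
    """Maximum-entropy per-rank relevance probabilities p_1..p_depth subject to
    E[RBP] = observed_rbp, where E[RBP] = (1 - beta) * sum_i beta**(i-1) * p_i."""
    weights = (1 - beta) * beta ** np.arange(depth)
    objective = lambda p: -binary_entropy(p).sum()              # maximize total entropy
    constraint = {"type": "eq",
                  "fun": lambda p: weights @ p - observed_rbp}  # match the observed measure value
    result = minimize(objective, np.full(depth, 0.5), method="SLSQP",
                      bounds=[(0.0, 1.0)] * depth, constraints=[constraint])
    return result.x

# Infer a relevance distribution from a single observed RBP value ...
p = infer_relevance_probs(0.45)
# ... then use it to predict the value of a different measure, e.g. precision at 10.
print("predicted P@10:", p[:10].mean())
```

The inferred probabilities play the role of the most likely relevance distribution: comparing them against the actual judgments quantifies the measure's informativeness, and plugging them into a different measure yields a prediction of its value.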

Published In

WWW '11: Proceedings of the 20th International Conference on World Wide Web
March 2011
840 pages
ISBN:9781450306324
DOI:10.1145/1963405
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. diversity
  2. effectiveness measures
  3. evaluation
  4. measure informativeness
  5. novelty

Conference

WWW '11: 20th International World Wide Web Conference
March 28 - April 1, 2011
Hyderabad, India

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)
