DOI: 10.1145/1390334.1390446
Research Article

Novelty and diversity in information retrieval evaluation

Published: 20 July 2008

Abstract

Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity. We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
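
The abstract only names the idea; the full measure is developed in the body of the paper. As a rough illustration of a cumulative-gain score that rewards novelty and penalizes redundancy, the following Python sketch discounts a document's gain for an information nugget according to how many higher-ranked documents already covered that nugget. The nugget representation, the decay parameter alpha, and the log-based rank discount are illustrative assumptions for this example, not the paper's exact formulation.

```python
import math

# Illustrative sketch only: a cumulative-gain measure in which repeated
# coverage of the same information nugget is geometrically discounted,
# so redundant documents contribute less than novel ones.
# The nugget sets, alpha, and the log2 rank discount are assumptions
# made for this example.
def novelty_cumulative_gain(ranking, nuggets_per_doc, alpha=0.5):
    times_seen = {}   # nugget -> number of higher-ranked documents covering it
    score = 0.0
    for rank, doc in enumerate(ranking, start=1):
        nuggets = nuggets_per_doc.get(doc, set())
        gain = sum((1.0 - alpha) ** times_seen.get(n, 0) for n in nuggets)
        for n in nuggets:
            times_seen[n] = times_seen.get(n, 0) + 1
        score += gain / math.log2(rank + 1)   # rank 1 receives no discount
    return score

# Example: d2 only repeats a nugget already covered by d1, so it adds less
# to the score than d3, which covers a new nugget at a lower rank.
if __name__ == "__main__":
    nuggets = {"d1": {"n1", "n2"}, "d2": {"n1"}, "d3": {"n3"}}
    print(novelty_cumulative_gain(["d1", "d2", "d3"], nuggets))
```

Normalizing such a score by the score of an ideal ordering would yield an nDCG-style value in [0, 1], which is how cumulative-gain measures of this kind are typically reported.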

    Published In

    SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
    July 2008, 934 pages
    ISBN: 9781605581644
    DOI: 10.1145/1390334

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. evaluation
    2. novelty
    3. test collections

