DOI: 10.1145/1390334.1390446
Research Article

Novelty and diversity in information retrieval evaluation

Published: 20 July 2008

Abstract

Evaluation measures act as objective functions to be optimized by information retrieval systems. Such objective functions must accurately reflect user requirements, particularly when tuning IR systems and learning ranking functions. Ambiguity in queries and redundancy in retrieved documents are poorly reflected by current evaluation measures. In this paper, we present a framework for evaluation that systematically rewards novelty and diversity. We develop this framework into a specific evaluation measure, based on cumulative gain. We demonstrate the feasibility of our approach using a test collection based on the TREC question answering track.
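
The abstract only names the idea; the full measure is developed in the body of the paper. As a rough illustration of a cumulative-gain score that rewards novelty and penalizes redundancy, the following Python sketch discounts a document's gain for an information nugget according to how many higher-ranked documents already covered that nugget. The nugget representation, the decay parameter alpha, and the log-based rank discount are illustrative assumptions for this example, not the paper's exact formulation.

```python
import math

# Illustrative sketch only: a cumulative-gain measure in which repeated
# coverage of the same information nugget is geometrically discounted,
# so redundant documents contribute less than novel ones.
# The nugget sets, alpha, and the log2 rank discount are assumptions
# made for this example.
def novelty_cumulative_gain(ranking, nuggets_per_doc, alpha=0.5):
    times_seen = {}   # nugget -> number of higher-ranked documents covering it
    score = 0.0
    for rank, doc in enumerate(ranking, start=1):
        nuggets = nuggets_per_doc.get(doc, set())
        gain = sum((1.0 - alpha) ** times_seen.get(n, 0) for n in nuggets)
        for n in nuggets:
            times_seen[n] = times_seen.get(n, 0) + 1
        score += gain / math.log2(rank + 1)   # rank 1 receives no discount
    return score

# Example: d2 only repeats a nugget already covered by d1, so it adds less
# to the score than d3, which covers a new nugget at a lower rank.
if __name__ == "__main__":
    nuggets = {"d1": {"n1", "n2"}, "d2": {"n1"}, "d3": {"n3"}}
    print(novelty_cumulative_gain(["d1", "d2", "d3"], nuggets))
```

Normalizing such a score by the score of an ideal ordering would yield an nDCG-style value in [0, 1], which is how cumulative-gain measures of this kind are typically reported.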

    Published In

    SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
    July 2008, 934 pages
    ISBN: 9781605581644
    DOI: 10.1145/1390334

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. evaluation
    2. novelty
    3. test collections

