IR evaluation methods for retrieving highly relevant documents

Published: 02 August 2017

Abstract

This paper proposes evaluation methods based on the use of non-dichotomous relevance judgements in IR experiments. It is argued that evaluation methods should credit IR methods for their ability to retrieve highly relevant documents. This is desirable from the user point of view in modern large IR environments. The proposed methods are (1) a novel application of P-R curves and average precision computations based on separate recall bases for documents of different degrees of relevance, and (2) two novel measures computing the cumulative gain the user obtains by examining the retrieval result up to a given ranked position. We then demonstrate the use of these evaluation methods in a case study on the effectiveness of query types, based on combinations of query structures and expansion, in retrieving documents of various degrees of relevance. The test was run with a best match retrieval system (InQuery) in a text database consisting of newspaper articles. The results indicate that the tested strong query structures are most effective in retrieving highly relevant documents. The differences between the query types are practically essential and statistically significant. More generally, the novel evaluation methods and the case demonstrate that non-dichotomous relevance assessments are applicable in IR experiments, may reveal interesting phenomena, and allow harder testing of IR methods.
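
The cumulative gain measures mentioned in point (2) are easy to illustrate in code. The sketch below is a minimal illustration of the idea rather than the paper's exact formulation: it assumes graded relevance gains (e.g. on a 0-3 scale) for the ranked documents of a single query and a logarithmic rank discount whose base is a free parameter; the function names and the example gain vector are invented for this illustration.

```python
import math

def cumulative_gain(gains):
    """Cumulated gain: running sum of graded relevance gains down the ranking."""
    cg, total = [], 0.0
    for g in gains:
        total += g
        cg.append(total)
    return cg

def discounted_cumulative_gain(gains, base=2):
    """Discounted cumulated gain: a gain found at rank >= base is divided by
    log_base(rank), so documents found late in the ranking contribute less."""
    dcg, total = [], 0.0
    for rank, g in enumerate(gains, start=1):
        total += g if rank < base else g / math.log(rank, base)
        dcg.append(total)
    return dcg

# Illustrative gain vector: graded relevance (0-3) of the first six documents
# retrieved for one query (hypothetical values, not taken from the paper).
gains = [3, 2, 3, 0, 0, 1]
print(cumulative_gain(gains))             # [3.0, 5.0, 8.0, 8.0, 8.0, 9.0]
print(discounted_cumulative_gain(gains))  # same totals, but late gains are discounted
```

Reading either list at position i gives the (discounted) gain accumulated by a user who examines the retrieval result down to rank i; averaging such vectors over a set of queries yields curves that can be compared across IR methods.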


Published In

ACM SIGIR Forum, Volume 51, Issue 2: SIGIR Test-of-Time Awardees 1978-2001 (July 2017), 276 pages
ISSN: 0163-5840
DOI: 10.1145/3130348
Editors: Donna Harman, Diane Kelly

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 02 August 2017
Published in ACM SIGIR Forum, Volume 51, Issue 2

Qualifiers

• Column
