skip to main content
10.1145/1076034.1076155acmconferencesArticle/Chapter ViewAbstractPublication PagesirConference Proceedingsconference-collections
Article

Predicting query difficulty on the web by learning visual clues

Published: 15 August 2005 Publication History

Abstract

We describe a method for predicting query difficulty in a precision-oriented web search task. Our approach uses visual features from retrieved surrogate document representations (titles, snippets, etc.) to predict retrieval effectiveness for a query. By training a supervised machine learning algorithm with manually evaluated queries, visual clues indicative of relevance are discovered. We show that this approach has a moderate correlation of 0.57 with precision at 10 scores from manual relevance judgments of the top ten documents retrieved by ten web search engines over 896 queries. Our findings indicate that difficulty predictors which have been successful in recall-oriented ad-hoc search, such as clarity metrics, are not nearly as correlated with engine performance in precision-oriented tasks such as this, yielding a maximum correlation of 0.3. Additionally, relying only on visual clues avoids the need for collection statistics that are required by these prior approaches. This enables our approach to be employed in environments where these statistics are unavailable or costly to retrieve, such as metasearch.

References

[1]
E. C. Jensen, et al. A Framework for Determining Necessary Query Set Sizes to Evaluate Web Search Effectiveness. In Proceedings of WWW'05.
[2]
S. Cronen-Townsend, et al. Predicting Query Performance. In Proceedings of SIGIR'02.
[3]
B. He and I. Ounis. Inferring Query Performance Using Pre-Retrieval Predictors. In Proceedings of SPIRE'04.

Cited By

View all

Index Terms

  1. Predicting query difficulty on the web by learning visual clues

      Recommendations

      Comments

      Information & Contributors

      Information

      Published In

      cover image ACM Conferences
      SIGIR '05: Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
      August 2005
      708 pages
      ISBN:1595930345
      DOI:10.1145/1076034
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 15 August 2005

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. query difficulty
      2. web search

      Qualifiers

      • Article

      Conference

      SIGIR05
      Sponsor:

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)0
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 06 Feb 2025

      Other Metrics

      Citations

      Cited By

      View all

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Figures

      Tables

      Media

      Share

      Share

      Share this Publication link

      Share on social media