DOI: 10.1007/978-3-642-36973-5_12
Article

Using document-quality measures to predict web-search effectiveness

Published: 24 March 2013

Abstract

The query-performance prediction task is to estimate retrieval effectiveness in the absence of relevance judgments. The task becomes highly challenging over the Web due to, among other reasons, the effect of low-quality (e.g., spam) documents on retrieval performance. To address this challenge, we present a novel prediction approach that utilizes query-independent document-quality measures. While using these measures was shown to improve Web-retrieval effectiveness, this is the first study demonstrating the clear merits of using them for query-performance prediction. Evaluation performed with large-scale Web collections shows that our methods post prediction quality that often surpasses that of state-of-the-art predictors, including those devised specifically for Web retrieval.

Published In

ECIR'13: Proceedings of the 35th European conference on Advances in Information Retrieval
March 2013
890 pages
ISBN: 9783642369728
Editors: Pavel Serdyukov, Pavel Braslavski, Sergei O. Kuznetsov, Jaap Kamps, Stefan Rüger

Sponsors

  • MRU: Mail.Ru
  • Google Inc.
  • ABBYY: ABBYY
  • RFBR: Russian Foundation for Basic Research
  • Yahoo! Labs

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. query-performance prediction
  2. web retrieval

Qualifiers

  • Article
