DOI: 10.1007/978-3-642-20161-5_39
Article

Smoothing Click Counts for Aggregated Vertical Search

Published: 18 April 2011

Abstract

Clickthrough data is a critical feature for improving web search ranking. Recently, many search portals have provided aggregated search, which retrieves relevant information from various heterogeneous collections called verticals. In addition to the well-known problem of rank bias, clickthrough data recorded in the aggregated search environment suffers from severe sparseness problems due to the limited number of results presented for each vertical. This skew in clickthrough data, which we call rank cut, makes optimization of vertical searches more difficult. In this work, we focus on mitigating the negative effect of rank cut for aggregated vertical searches. We introduce a technique for smoothing click counts based on spectral graph analysis. Using real clickthrough data from a vertical recorded in an aggregated search environment, we show empirically that clickthrough data smoothed by this technique is effective for improving the vertical search.
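
The abstract does not spell out the smoothing procedure, so the following is only a minimal sketch of one common graph-based approach, not the authors' method. It assumes a hypothetical document-similarity matrix `similarity` and a hypothetical smoothing weight `lam`, and it applies standard graph-Laplacian regularization: the smoothed counts f minimize ||f - y||^2 + lam * f^T L f, where y holds the raw click counts and L is the unnormalized Laplacian of the similarity graph.

```python
import numpy as np


def smooth_click_counts(clicks, similarity, lam=1.0):
    """Smooth raw click counts over a document-similarity graph.

    Illustrative assumption, not the paper's algorithm: solves
        min_f ||f - y||^2 + lam * f^T L f
    whose closed form is f = (I + lam * L)^{-1} y, with L = D - W
    the unnormalized graph Laplacian of the similarity matrix W.
    """
    y = np.asarray(clicks, dtype=float)
    W = np.asarray(similarity, dtype=float)
    W = (W + W.T) / 2.0        # enforce a symmetric (undirected) graph
    np.fill_diagonal(W, 0.0)   # ignore self-similarity
    L = np.diag(W.sum(axis=1)) - W
    # Solve the linear system instead of forming an explicit inverse.
    return np.linalg.solve(np.eye(len(y)) + lam * L, y)


# Hypothetical toy data: document 0 was shown and clicked 10 times;
# document 1 is similar to it but was cut from the displayed results;
# document 2 is unrelated to both.
clicks = [10, 0, 0]
similarity = [[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]]
smoothed = smooth_click_counts(clicks, similarity, lam=1.0)
```

In this toy case, part of the click mass of the displayed document flows to its unseen neighbor, the unrelated document stays at zero, and the total click count is preserved because the rows of L sum to zero.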

Published In

ECIR 2011: Proceedings of the 33rd European Conference on Advances in Information Retrieval - Volume 6611
April 2011
792 pages
ISBN:9783642201608
Editors: Paul Clough, Colum Foley, Cathal Gurrin, Gareth Jones, Wessel Kraaij, Hyowon Lee, Vanessa Murdock

Publisher

Springer-Verlag

Berlin, Heidelberg
