
Learning to rank query suggestions for adhoc and diversity search

Published: 01 August 2013

Abstract

Query suggestions have become pervasive in modern web search, as a mechanism to guide users towards a better representation of their information need. In this article, we propose a ranking approach for producing effective query suggestions. In particular, we devise a structured representation of candidate suggestions mined from a query log that leverages evidence from other queries with a common session or a common click. This enriched representation not only helps overcome data sparsity for long-tail queries, but also leads to multiple ranking criteria, which we integrate as features for learning to rank query suggestions. To validate our approach, we build upon existing efforts for web search evaluation and propose a novel framework for the quantitative assessment of query suggestion effectiveness. Thorough experiments using publicly available data from the TREC Web track show that our approach provides effective suggestions for adhoc and diversity search.
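The abstract describes mining candidate suggestions from a query log using evidence from other queries that share a session or a click with the original query. As a rough illustration of that idea, the sketch below gathers such co-occurrence counts as simple features that a learned ranker could consume. The log schema (session_id, query, clicked_url), the function name, and the feature names are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' method): from a query log of
# (session_id, query, clicked_url) rows, collect candidate suggestions for
# a target query via (a) queries issued in the same session and (b) queries
# leading to the same clicked URL, and count co-occurrences as features.
from collections import defaultdict

def candidate_features(log, target):
    """log: iterable of (session_id, query, clicked_url) tuples."""
    session_queries = defaultdict(set)  # session_id -> queries issued
    click_queries = defaultdict(set)    # clicked_url -> queries leading to it
    for session_id, query, clicked_url in log:
        session_queries[session_id].add(query)
        if clicked_url:
            click_queries[clicked_url].add(query)

    session_cooc = defaultdict(int)  # candidate -> #sessions shared with target
    click_cooc = defaultdict(int)    # candidate -> #clicked URLs shared with target
    for queries in session_queries.values():
        if target in queries:
            for q in queries:
                if q != target:
                    session_cooc[q] += 1
    for queries in click_queries.values():
        if target in queries:
            for q in queries:
                if q != target:
                    click_cooc[q] += 1

    # One feature vector per candidate; a learning-to-rank model would
    # score these (alongside many other features) to rank suggestions.
    candidates = set(session_cooc) | set(click_cooc)
    return {c: {"shared_sessions": session_cooc[c],
                "shared_clicks": click_cooc[c]} for c in candidates}

if __name__ == "__main__":
    log = [
        (1, "jaguar", "en.wikipedia.org/wiki/Jaguar"),
        (1, "jaguar animal", "en.wikipedia.org/wiki/Jaguar"),
        (2, "jaguar", "jaguar.com"),
        (2, "jaguar cars", "jaguar.com"),
    ]
    print(candidate_features(log, "jaguar"))
```

Pooling session and click evidence in this way is one plausible reading of how a structured candidate representation could mitigate sparsity for long-tail queries: a rare query may still share clicks or sessions with more frequent ones.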




      Published In

Information Retrieval, Volume 16, Issue 4
August 2013
129 pages

      Publisher

      Kluwer Academic Publishers

      United States

      Publication History

      Published: 01 August 2013
      Accepted: 03 September 2012
      Received: 21 May 2012

      Author Tags

      1. Web search
      2. Learning to rank
      3. Query suggestions
      4. Relevance
      5. Diversity

      Qualifiers

      • Research-article
