DOI: 10.1145/1772690.1772714
research-article

Actively predicting diverse search intent from user browsing behaviors

Published: 26 April 2010

Abstract

This paper is concerned with actively predicting search intent from user browsing behavior data. In recent years, great attention has been paid to predicting user search intent. However, such prediction has been mostly passive, because it is performed only after users submit their queries to search engines. Why users issued these queries, and what triggered their information needs, has not been considered. According to our study, many information needs are actually triggered by what users have browsed. That is, after reading a page, a user who finds something interesting or unclear may want further information and formulate a search query accordingly. Actively predicting such search intent can benefit both search engines and their users. In this paper, we propose a series of techniques for this task. First, we extract from user browsing behavior data all the queries that users issued after reading a given page. Second, we learn a model to rank these queries by their likelihood of being triggered by the page. Third, since search intents can be quite diverse even when triggered by the same page, we propose an optimization algorithm to diversify the ranked list of queries obtained in the second step, and then suggest the list to users. We have tested our approach on large-scale user browsing behavior data obtained from a commercial search engine. The experimental results show that our approach can predict meaningful queries for a given page, and that search performance for these queries can be significantly improved by using the triggering page as contextual information.
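
The three steps in the abstract can be illustrated with a minimal sketch. The abstract does not specify the log format, the learned ranking model, or the diversification algorithm, so everything below is an assumption made for illustration: the `(kind, value)` event format is hypothetical, relative frequency stands in for the learned ranking model, and a greedy MMR-style re-ranking over token overlap stands in for the paper's optimization algorithm.

```python
from collections import Counter, defaultdict


def extract_triggered_queries(sessions):
    """Step 1: map each page to a Counter of queries issued right after it.

    `sessions` is a list of event lists; each event is a (kind, value)
    tuple with kind in {"page", "query"}. This format is hypothetical.
    """
    triggered = defaultdict(Counter)
    for events in sessions:
        for (kind, value), (next_kind, next_value) in zip(events, events[1:]):
            if kind == "page" and next_kind == "query":
                triggered[value][next_value] += 1
    return triggered


def rank_by_likelihood(query_counts):
    """Step 2 placeholder: relative frequency stands in for a learned model."""
    total = sum(query_counts.values())
    scores = {q: c / total for q, c in query_counts.items()}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def jaccard(a, b):
    """Token-level Jaccard similarity between two query strings."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def diversify(ranked, k=3, trade_off=0.5):
    """Step 3 stand-in: greedy MMR-style selection, not the paper's algorithm."""
    candidates = dict(ranked)
    selected = []
    while candidates and len(selected) < k:
        def mmr(q):
            # Penalize a candidate by its similarity to the closest pick so far.
            redundancy = max((jaccard(q, s) for s in selected), default=0.0)
            return trade_off * candidates[q] - (1 - trade_off) * redundancy
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        del candidates[best]
    return selected
```

Given a page followed by "jaguar car price" twice, "jaguar car review" once, and "jaguar animal habitat" once, `diversify(rank_by_likelihood(counts), k=2)` keeps the most frequent query and then prefers the habitat query over the review query, because the review query overlaps more with the one already selected.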


    Published In

    WWW '10: Proceedings of the 19th international conference on World wide web
    April 2010
    1407 pages
    ISBN:9781605587998
    DOI:10.1145/1772690

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. contextual information
    2. diversification
    3. log mining
    4. search intent
    5. search trigger

    Qualifiers

    • Research-article

    Conference

    WWW '10: The 19th International World Wide Web Conference
    April 26 - 30, 2010
    Raleigh, North Carolina, USA

    Acceptance Rates

    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
