DOI: 10.1145/1935826.1935839
research-article

Who uses web search for what: and how

Published: 09 February 2011 Publication History

Abstract

We analyze a large query log of 2.3 million anonymous registered users from a web-scale U.S. search engine to jointly study their online behavior in terms of who they might be (demographics), what they search for (query topics), and how they search (session analysis). We examine basic demographics from registration information provided by the users, augmented with U.S. census data; analyze basic session statistics; classify queries into types (navigational, informational, transactional) based on click entropy; classify queries into topic categories; and cluster users based on the queries they issued. We then examine the resulting clusters in terms of demographics and search behavior. Our analysis suggests that there are important differences across demographic groups in the topics they search for and in how they search. For example, white conservatives are users likely to have voted Republican, mostly white males, who search for business, home, and gardening related topics; Baby Boomers tend to be primarily interested in Finance, and a large fraction of their sessions consist of simple navigational queries related to online banking. Finally, we examine regional search differences, which seem to correlate with differences in local industries (e.g., gambling related queries are highest in Las Vegas and lowest in Salt Lake City; searches related to actors are about three times higher in L.A. than in any other region).
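The abstract does not spell out how click entropy is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes click entropy is the Shannon entropy of the distribution of clicked URLs for a given query: low entropy means most users click the same result, which is typical of navigational queries. The function names, the input format (a list of clicked URLs per query), and the threshold of 1.0 bit are all hypothetical.

```python
import math
from collections import Counter


def click_entropy(clicked_urls):
    """Shannon entropy (in bits) of the clicked-URL distribution for one query.

    `clicked_urls` is a hypothetical input format: one entry per observed click.
    """
    counts = Counter(clicked_urls)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # sum over URLs of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def looks_navigational(clicked_urls, threshold=1.0):
    """Flag a query as likely navigational when its click entropy is low.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return click_entropy(clicked_urls) <= threshold


# All clicks go to a single site: entropy is zero.
bank_clicks = ["bank.example.com"] * 4
# Clicks are spread evenly over four sites: entropy is two bits.
recipe_clicks = ["a.example", "b.example", "c.example", "d.example"]

print(click_entropy(bank_clicks), looks_navigational(bank_clicks))
print(click_entropy(recipe_clicks), looks_navigational(recipe_clicks))
```

Entropy alone only separates concentrated from dispersed clicking; telling informational from transactional queries would need further signals that the abstract does not describe.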

Supplementary Material

JPG File (wsdm2011_weber_wuw_01.jpg)
MP4 File (wsdm2011_weber_wuw_01.mp4)



Published In

WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining
February 2011
870 pages
ISBN:9781450304931
DOI:10.1145/1935826

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. demographics
  2. query logs
  3. session analysis
  4. topic classification

Conference

Acceptance Rates

WSDM '11 Paper Acceptance Rate 83 of 372 submissions, 22%;
Overall Acceptance Rate 498 of 2,863 submissions, 17%


Article Metrics

  • Downloads (Last 12 months)30
  • Downloads (Last 6 weeks)5
Reflects downloads up to 12 Jan 2025
