Using the Crowd to Improve Search Result Ranking and the Search Experience

Published: 12 July 2016

Abstract

Despite technological advances, algorithmic search systems still have difficulty with complex or subtle information needs. For example, scenarios requiring deep semantic interpretation are a challenge for computers. People, on the other hand, are well suited to solving such problems. As a result, there is an opportunity for humans and computers to collaborate during the course of a search in a way that takes advantage of the unique abilities of each. While search tools that rely on human intervention will never be able to respond as quickly as current search engines do, recent research suggests that there are scenarios where a search engine could take more time if it resulted in a much better experience. This article explores how crowdsourcing can be used at query time to augment key stages of the search pipeline. We first explore the use of crowdsourcing to improve search result ranking. When the crowd is used to replace or augment traditional retrieval components such as query expansion and relevance scoring, we find that we can increase robustness against failure for query expansion and improve overall precision for results filtering. However, the gains that we observe are limited and unlikely to make up for the extra cost and time that the crowd requires. We then explore ways to incorporate the crowd into the search process that more drastically alter the overall experience. We find that using crowd workers to support rich query understanding and result processing appears to be a more worthwhile way to make use of the crowd during search. Our results confirm that crowdsourcing can positively impact the search experience but suggest that significant changes to the search process may be required for crowdsourcing to fulfill its potential in search systems.

Published In

ACM Transactions on Intelligent Systems and Technology, Volume 7, Issue 4
Special Issue on Crowd in Intelligent Systems, Research Note/Short Paper and Regular Papers
July 2016
498 pages
ISSN: 2157-6904
EISSN: 2157-6912
DOI: 10.1145/2906145
Editor: Yu Zheng
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 July 2016
Accepted: 01 November 2015
Revised: 01 August 2015
Received: 01 February 2015
Published in TIST Volume 7, Issue 4

Author Tags

  1. Slow search
  2. crowdsourcing
  3. information retrieval

Qualifiers

  • Research-article
  • Research
  • Refereed
