DOI: 10.1145/3357384.3358087
CIKM '19 Conference Proceedings · Short Paper

Cluster-Based Focused Retrieval

Published: 03 November 2019

Abstract

The focused retrieval task is to rank documents' passages by their presumed relevance to a query. Inspired by work on cluster-based document retrieval, we present a novel cluster-based focused retrieval method. The method ranks clusters of similar passages using a learning-to-rank approach and then transforms the cluster ranking into a passage ranking. Empirical evaluation demonstrates the clear merits of the method.
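
The abstract only sketches the approach, so the following is a minimal, purely illustrative Python sketch of the general cluster-based focused retrieval idea it describes: cluster the top-retrieved passages, score the clusters with a learned ranker (here a stand-in linear model), and transform the cluster ranking into a passage ranking. The nearest-neighbor clustering, the three aggregate features, and the max-over-containing-clusters transformation are all assumptions made for demonstration, not details taken from the paper.

# Illustrative sketch only: the clustering method, the learning-to-rank features,
# and the exact cluster-to-passage transformation below are assumed, not from the paper.
import numpy as np

def nearest_neighbor_clusters(passage_vecs, k=4):
    """Form one overlapping cluster per passage: the passage and its k nearest neighbors
    (a common choice in cluster-based retrieval work; assumed here)."""
    sims = passage_vecs @ passage_vecs.T
    clusters = []
    for i in range(len(passage_vecs)):
        neighbors = np.argsort(-sims[i])[: k + 1]  # includes the passage itself
        clusters.append(list(neighbors))
    return clusters

def score_clusters(clusters, passage_scores, weights):
    """Stand-in for a learned cluster ranker: a linear model over simple
    aggregates of the initial passage retrieval scores (hypothetical features)."""
    feats = np.array(
        [[np.mean(passage_scores[c]), np.max(passage_scores[c]), np.min(passage_scores[c])]
         for c in clusters]
    )
    return feats @ weights

def passages_from_cluster_ranking(clusters, cluster_scores):
    """Transform the cluster ranking into a passage ranking: each passage is scored by
    the best-scoring cluster that contains it (one plausible transformation; assumed)."""
    best = {}
    for cluster, score in zip(clusters, cluster_scores):
        for p in cluster:
            best[p] = max(best.get(p, float("-inf")), score)
    return sorted(best, key=best.get, reverse=True)

# Toy usage with random passage embeddings and initial retrieval scores.
rng = np.random.default_rng(0)
vecs = rng.normal(size=(20, 8))
vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)
init_scores = rng.random(20)
clusters = nearest_neighbor_clusters(vecs)
cluster_scores = score_clusters(clusters, init_scores, weights=np.array([0.5, 0.3, 0.2]))
print(passages_from_cluster_ranking(clusters, cluster_scores)[:5])

In the paper's setting, the hand-set weights above would instead be learned with a learning-to-rank method over cluster-level features; the sketch only illustrates the overall pipeline.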


Information

Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN:9781450369763
DOI:10.1145/3357384

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cluster ranking
  2. focused retrieval
  3. passage retrieval

Qualifiers

  • Short-paper

Funding Sources

  • German Research Foundation (DFG) via the German-Israeli Project Cooperation (DIP)

Conference

CIKM '19

Acceptance Rates

CIKM '19 Paper Acceptance Rate: 202 of 1,031 submissions (20%)
Overall Acceptance Rate: 1,861 of 8,427 submissions (22%)
