research-article

On Type-Aware Entity Retrieval

Published: 01 October 2017

Abstract

Today, the practice of returning entities from a knowledge base in response to search queries has become widespread. One of the distinctive characteristics of entities is that they are typed, i.e., assigned to some hierarchically organized type system (type taxonomy). The primary objective of this paper is to gain a better understanding of how entity type information can be utilized in entity retrieval. We perform this investigation in an idealized "oracle" setting, assuming that we know the distribution of target types of the relevant entities for a given query. We perform a thorough analysis of three main aspects: (i) the choice of type taxonomy, (ii) the representation of hierarchical type information, and (iii) the combination of type-based and term-based similarity in the retrieval model. Using a standard entity search test collection based on DBpedia, we find that type information proves most useful when using large type taxonomies that provide very specific types. We provide further insights on the extensional coverage of entities and on the utility of target types.
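As a rough illustration of aspect (iii), the sketch below linearly interpolates a term-based score with a type-based score. The abstract does not specify the combination method, so the interpolation, the weight `lam`, the function names, and the type labels `Film` and `Work` are all assumptions made for illustration. The target type distribution stands in for the "oracle" knowledge described above.

```python
from typing import Dict, Set


def type_score(target_types: Dict[str, float], entity_types: Set[str]) -> float:
    """Sum the target-type probability mass covered by the entity's types.

    target_types: assumed oracle distribution over types for the query.
    entity_types: types the entity is assigned to in the taxonomy.
    """
    return sum(p for t, p in target_types.items() if t in entity_types)


def combined_score(
    term_score: float,
    target_types: Dict[str, float],
    entity_types: Set[str],
    lam: float = 0.5,
) -> float:
    """Interpolate term-based and type-based similarity (assumed scheme)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * term_score + lam * type_score(target_types, entity_types)


# Hypothetical query whose relevant entities are mostly films.
oracle_types = {"Film": 0.75, "Work": 0.25}
score = combined_score(0.5, oracle_types, {"Film"}, lam=0.5)
```

With `lam = 0` the ranking falls back to the term-based score alone, and with `lam = 1` it depends only on type overlap, which makes the contribution of type information easy to isolate in such a sketch.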

Published In

ICTIR '17: Proceedings of the ACM SIGIR International Conference on Theory of Information Retrieval
October 2017
348 pages
ISBN: 9781450344906
DOI: 10.1145/3121050
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. entity retrieval
  2. entity types
  3. semantic search

Qualifiers

  • Research-article

Conference

ICTIR '17
Acceptance Rates

ICTIR '17 Paper Acceptance Rate 27 of 54 submissions, 50%;
Overall Acceptance Rate 235 of 527 submissions, 45%
