On the modeling of entities for ad-hoc entity search in the web of data
European Conference on Information Retrieval, 2012 • Springer
Abstract
The Web of Data describes objects, entities, or “things” in terms of their attributes and their relationships, using RDF statements. There is a need to make this wealth of knowledge easily accessible by means of keyword search. Despite recent research efforts in this direction, there is a lack of understanding of how structured semantic data is best represented for text-based entity retrieval. The task we address in this paper is ad-hoc entity search: the retrieval of RDF resources that represent an entity described in the keyword query. We build upon and formalise existing entity modeling approaches within a generative language modeling framework, and compare them experimentally using a standard test collection provided by the Semantic Search Challenge evaluation series. We show that these models outperform the current state of the art in terms of retrieval effectiveness; however, this comes at the cost of abandoning a large part of the semantics behind the data. We propose a novel entity model capable of preserving the semantics associated with entities without sacrificing retrieval effectiveness.