skip to main content
research-article

Algebraic Structures for Capturing the Provenance of SPARQL Queries

Published: 12 February 2016 Publication History

Abstract

The evaluation of SPARQL algebra queries on various kinds of annotated RDF graphs can be seen as a particular case of the evaluation of these queries on RDF graphs annotated with elements of so-called spm-semirings. Spm-semirings extend semirings, used for representing the provenance of positive relational algebra queries on annotated relational data, with a new operator to capture the semantics of the non-monotone SPARQL operators. Furthermore, spm-semiring-based annotations ensure that desired SPARQL query equivalences hold when querying annotated RDF. In this work, in addition to introducing spm-semirings, we study their properties and provide an alternative characterization of these structures in terms of semirings with an embedded boolean algebra (or seba-structure for short). This characterization allows us to construct spm-semirings and identify a universal object in the class of spm-semirings. Finally, we show that this universal object provides a provenance representation of poly-sized overhead and can be used to evaluate SPARQL queries on arbitrary spm-semiring-annotated RDF graphs.

References

[1]
Serge Abiteboul, Richard Hull, and Victor Vianu. 1995. Foundations of Databases. Addison-Wesley.
[2]
Yael Amsterdamer, Daniel Deutch, and Val Tannen. 2011a. On the limitations of provenance for queries with difference. In Proceedings of the 3rd USENIX Workshop on the Theory and Practice of Provenance (TaPP’11). USENIX Association.
[3]
Yael Amsterdamer, Daniel Deutch, and Val Tannen. 2011b. Provenance for aggregate queries. In Proceedings of the 30th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’11). ACM, 153--164.
[4]
Renzo Angles and Claudio Gutierrez. 2008. The expressive power of SPARQL. In International Semantic Web Conference (Lecture Notes in Computer Science), Vol. 5318. Springer, 114--129.
[5]
Peter Buneman, James Cheney, and Stijn Vansummeren. 2008. On the expressiveness of implicit provenance in query and update languages. ACM Transactions on Database Systems 33, 4, Article 28 (2008), 47 pages.
[6]
Peter Buneman and Egor V. Kostylev. 2011. Annotation algebras for RDFS. In Proceedings of the 2nd International Workshop on the role of Semantic Web in Provenance Management. CEUR Workshop Proceedings, Article 1, 6 pages. ceur-ws.org/Vol-670/paper_4.pdf.
[7]
Stanley Burris and H. P. Sankappanavar. 1981. A Course in Universal Algebra. Number 78 in Graduate Texts in Mathematics. Springer-Verlag.
[8]
Jeremy Carroll, Christian Bizer, Pat Hayes, and Patrick Stickler. 2005. Named graphs. Web Semantics: Science, Services and Agents on the World Wide Web 3, 4 (2005), 247--267.
[9]
James Cheney, Laura Chiticariu, and Wang-Chiew Tan. 2009. Provenance in databases: Why, how, and where. Found Trends Databases 1, 4 (2009), 379--474.
[10]
Carlos Viegas Damásio, Anastasia Analyti, and Grigoris Antoniou. 2012. Provenance for SPARQL queries. In Proceedings of the 11th International Conference on the Semantic Web (ISWC’12). Lecture Notes in Computer Science, Vol. 7649. Springer, 625--640.
[11]
Renata Dividino, Sergej Sizov, Steffen Staab, and Bernhard Schueler. 2009. Querying for provenance, trust, uncertainty and other meta knowledge in RDF. Web Semantics: Science, Services and Agents on the World Wide Web 7, 3 (2009), 204--219.
[12]
J. Nathan Foster, Todd J. Green, and Val Tannen. 2008. Annotated XML: Queries and provenance. In Proceedings of the 27th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS’08). ACM, 271--280.
[13]
Michael R. Garey and David S. Johnson. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman.
[14]
Floris Geerts, Grigoris Karvounarakis, Vassilis Christophides, and Irini Fundulaki. 2013. Algebraic structures for capturing the provenance of SPARQL queries. In Proceedings of the 16th International Conference on Database Theory (ICDT’13). ACM, 153--164.
[15]
Floris Geerts and Antonella Poggi. 2010. On database query languages for K-relations. Journal of Applied Logic 8, 2 (2010), 173--185.
[16]
Boris Glavic and Gustavo Alonso. 2009. Perm: Processing provenance and data on the same data model through query rewriting. In Proceedings of the 25th International Conference on Data Engineering (ICDE’09). IEEE, 174--185.
[17]
Todd J. Green. 2011. Containment of conjunctive queries on annotated relations. Theory of Computing Systems 49, 2 (2011), 429--459.
[18]
Todd J. Green, Gregory Karvounarakis, and Val Tannen. 2007. Provenance semirings. In Proceedings of the 26th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS’07). ACM, 31--40.
[19]
Steve Harris and Andy Seaborne. 2013. SPARQL 1.1 Query Language. Retrieved http://www.w3.org/TR/sparql11-query/.
[20]
Olaf Hartig. 2008. Specification for tSPARQL. (2008). http://trdf.sf.net/documents/tsparql.pdf.
[21]
Olaf Hartig. 2009. Querying trust in rdf data with tSPARQL. In Proceedings of the 6th European Semantic Web Conference on The Semantic Web: Research and Applications (ESWC’09). Lecture Notes in Computer Science, Vol. 5554. Springer, 5--20.
[22]
Hai Huang and Chengfei Liu. 2009. Query evaluation on probabilistic RDF databases. In Proceedings of the 10th International Conference on Web Information Systems Engineering (WISE’09) (Lecture Notes in Computer Science), Vol. 5802. Springer, 307--320.
[23]
Edward V. Huntington. 1933a. Boolean algebra. A correction. Transactions of the American Mathematical Society 35, 2 (1933), pp. 557--558.
[24]
Edward V. Huntington. 1933b. New sets of independent postulates for the algebra of logic, with special reference to Whitehead and Russell. Transactions of the American Mathematical Society 35 (1933), 274--304.
[25]
Grigoris Karvounarakis, Todd J. Green, Zachary G. Ives, and Val Tannen. 2013. Collaborative data sharing via update exchange and provenance. ACM Transactions on Database Systems 38, 3, Article 19 (2013), 42 pages.
[26]
Grigoris Karvounarakis, Zachary G. Ives, and Val Tannen. 2010. Querying data provenance. In Proceedings of the 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD’10). ACM, 951--962.
[27]
Anastasios Kementsietsidis, Enela Pema, and Wang-Chiew Tan. 2014. Query answering over incomplete and uncertain RDF. In Proceedings of the 17th International Workshop on the Web and Databases (WebDB’14).
[28]
Frank Manola, Eric Miller, and Brian McBride. 2004. RDF Primer. Retrieved from www.w3.org/TR/rdf-primer.
[29]
Mauro Mazzieri and Aldo Franco Dragoni. 2008. A fuzzy semantics for the resource description framework. In SWC International Workshops Uncertainty Reasoning for the Semantic Web I (URSW). Lecture Notes in Computer Science, Vol. 5327. Springer, 244--261.
[30]
William McCune. 2005--2010. Prover9 and Mace4. Retrieved from http://www.cs.unm.edu/∼mccune/prover9/.
[31]
Jorge Pérez, Marcelo Arenas, and Claudio Gutierrez. 2009. Semantics and complexity of SPARQL. ACM Transactions on Database Systems 34, 3, Article 16 (2009), 45 pages.
[32]
Eric Prud’hommeaux and Andy Seaborne. 2008. SPARQL Query Language for RDF. www.w3.org/TR/rdf-sparql-query. (January 2008).
[33]
Michael Schmidt, Michael Meier, and Georg Lausen. 2010. Foundations of SPARQL query optimization. In Proceedings of the 13th International Conference on Database Theory (ICDT’10). ACM, 4--33.
[34]
Umberto Straccia. 2013. Foundations of Fuzzy Logic and Semantic Web Languages. Chapman and Hall/CRC.
[35]
Yannis Theoharis, Irini Fundulaki, Grigoris Karvounarakis, and Vassilis Christophides. 2011. On provenance of queries on semantic web data. IEEE Internet Computing 15, 1 (2011), 31--39.
[36]
Octavian Udrea, Diego Reforgiato Recupero, and V. S. Subrahmanian. 2010. Annotated RDF. ACM Transactions on Computational Logic 11, 2, Article 10 (2010), 41 pages.
[37]
Moshe Y. Vardi. 1982. The complexity of relational query languages (extended abstract). In Proceedings of the 14th Annual ACM Symposium on Theory of Computing (STOC’82). ACM, 137--146.
[38]
Antoine Zimmermann, Nuno Lopes, Axel Polleres, and Umberto Straccia. 2012. A general framework for representing, reasoning and querying with annotated Semantic Web data. Journal of Web Semantics 11 (2012), 72--95.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Journal of the ACM
Journal of the ACM  Volume 63, Issue 1
March 2016
353 pages
ISSN:0004-5411
EISSN:1557-735X
DOI:10.1145/2893450
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 February 2016
Accepted: 01 July 2015
Revised: 01 July 2015
Received: 01 June 2015
Published in JACM Volume 63, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Algebraic structures
  2. RDF
  3. SPARQL
  4. annotations
  5. provenance models
  6. query languages

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)30
  • Downloads (Last 6 weeks)4
Reflects downloads up to 01 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media