DOI: 10.1145/3587259.3627565
research-article

XSD2SHACL: Capturing RDF Constraints from XML Schema

Published: 05 December 2023

Abstract

SHACL shapes describe the constraints of RDF subgraphs that are constructed from heterogeneous data sources, such as relational databases, JSON, and XML. These sources often already have constraints defined in their schemas, e.g., JSON Schema for JSON or XSD for XML, but this information is ignored when the RDF graph is constructed, as few existing works translate such schemas into SHACL. In this paper, we focus on incorporating XSD constraints for XML data sources into SHACL shapes. We define a translation from XSD to SHACL and provide a corresponding system. We compare our solution with XMLSchema2ShEx, which translates XSD constraints to ShEx, and validate it against two use cases. Our solution provides the desired SHACL shapes in a reasonable time. This allows SHACL shapes to be derived automatically for the original raw data, without manual effort.
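
To make the idea of carrying XSD constraints over to SHACL concrete, the following is a minimal illustrative sketch. It is not the translation defined in the paper, whose rules are not reproduced on this page. It assumes a simplified correspondence in which `minOccurs`/`maxOccurs` become `sh:minCount`/`sh:maxCount` and a built-in `xs:` type becomes `sh:datatype`. The function name `xsd_to_shacl`, the `ex:` prefix, the `Person` schema, and the assumption that the schema uses the `xs:` namespace prefix are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of XML Schema, in ElementTree's "{uri}tag" notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical example schema, used only for illustration.
XSD_SNIPPET = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="Person">
    <xs:sequence>
      <xs:element name="name" type="xs:string"/>
      <xs:element name="email" type="xs:string" minOccurs="0" maxOccurs="unbounded"/>
      <xs:element name="age" type="xs:integer" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def xsd_to_shacl(xsd_text: str, prefix: str = "ex") -> str:
    """Emit one SHACL node shape (Turtle, prefixes omitted) per named complexType.

    Assumed, simplified mapping (not the paper's):
      minOccurs -> sh:minCount (omitted when 0)
      maxOccurs -> sh:maxCount (omitted when "unbounded")
      type="xs:T" -> sh:datatype xsd:T
    """
    root = ET.fromstring(xsd_text)
    shapes = []
    for ctype in root.iter(XS + "complexType"):
        name = ctype.get("name")
        if name is None:
            continue  # anonymous types are skipped in this sketch
        statements = [
            f"{prefix}:{name}Shape a sh:NodeShape",
            f"    sh:targetClass {prefix}:{name}",
        ]
        for el in ctype.iter(XS + "element"):
            parts = [f"sh:path {prefix}:{el.get('name')}"]
            xsd_type = el.get("type", "")
            if xsd_type.startswith("xs:"):
                parts.append(f"sh:datatype xsd:{xsd_type[3:]}")
            # In XSD, both occurrence attributes default to 1.
            min_occurs = int(el.get("minOccurs", "1"))
            max_occurs = el.get("maxOccurs", "1")
            if min_occurs > 0:
                parts.append(f"sh:minCount {min_occurs}")
            if max_occurs != "unbounded":
                parts.append(f"sh:maxCount {max_occurs}")
            statements.append("    sh:property [ " + " ; ".join(parts) + " ]")
        shapes.append(" ;\n".join(statements) + " .")
    return "\n\n".join(shapes)


if __name__ == "__main__":
    print(xsd_to_shacl(XSD_SNIPPET))
```

Under these assumptions, a required single-valued element such as `name` receives both a minimum and a maximum count of 1, while an optional, unbounded element such as `email` carries only a datatype constraint.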


Published In

K-CAP '23: Proceedings of the 12th Knowledge Capture Conference 2023
December 2023
270 pages
ISBN: 9798400701412
DOI: 10.1145/3587259
Editors: Brent Venable, Daniel Garijo, Brian Jalaian
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. RDF shapes
  2. SHACL
  3. Validation
  4. XML Schema
  5. XSD

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

K-CAP '23: Knowledge Capture Conference 2023
December 5-7, 2023
Pensacola, FL, USA

Acceptance Rates

Overall Acceptance Rate 55 of 198 submissions, 28%
