DOI: 10.5555/2063016.2063055

FedX: optimization techniques for federated query processing on linked data

Published: 23 October 2011

Abstract

Motivated by the ongoing success of Linked Data and the growing amount of semantic data sources available on the Web, new challenges to query processing are emerging. Especially in distributed settings that require joining data provided by multiple sources, sophisticated optimization techniques are necessary for efficient query processing. We propose novel join processing and grouping techniques to minimize the number of remote requests, and develop an effective solution for source selection in the absence of preprocessed metadata. We present FedX, a practical framework that enables efficient SPARQL query processing on heterogeneous, virtually integrated Linked Data sources. In experiments, we demonstrate the practicability and efficiency of our framework on a set of real-world queries and data sources from the Linked Open Data cloud. With FedX we achieve a significant improvement in query performance over state-of-the-art federated query engines.
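The abstract names two ideas without giving their mechanics: selecting sources without preprocessed metadata, and grouping to reduce remote requests. The following is a minimal sketch of how the two could fit together, not the paper's implementation. It assumes that source selection probes each endpoint per triple pattern (for example with a SPARQL ASK request) and that patterns with exactly one relevant endpoint can be sent there together. The endpoint names, predicates, and the functions `can_answer`, `select_sources`, and `group_patterns` are all hypothetical, and the endpoints are simulated in memory.

```python
from collections import defaultdict

# Hypothetical stand-ins for remote SPARQL endpoints: each name maps to the
# predicates that endpoint holds data for. Real endpoints are reached over HTTP.
ENDPOINTS = {
    "dbpedia": {"rdf:type", "dbo:birthPlace", "dbo:deathPlace"},
    "drugbank": {"rdf:type", "drugbank:name"},
    "nytimes": {"nyt:topicPage"},
}

def can_answer(endpoint, pattern):
    """Assumed probe (e.g. a SPARQL ASK request): can the endpoint match the pattern?"""
    _subject, predicate, _obj = pattern
    return predicate in ENDPOINTS[endpoint]

def select_sources(patterns):
    """Map every triple pattern to the endpoints that can contribute to it."""
    return {p: [e for e in ENDPOINTS if can_answer(e, p)] for p in patterns}

def group_patterns(sources):
    """Bundle patterns with exactly one relevant endpoint into one request per endpoint."""
    grouped = defaultdict(list)
    ungrouped = []
    for pattern, endpoints in sources.items():
        if len(endpoints) == 1:
            grouped[endpoints[0]].append(pattern)
        else:
            ungrouped.append(pattern)
    return dict(grouped), ungrouped

patterns = [
    ("?person", "dbo:birthPlace", "?city"),
    ("?person", "dbo:deathPlace", "?city"),
    ("?person", "nyt:topicPage", "?page"),
    ("?person", "rdf:type", "?type"),
]
grouped, ungrouped = group_patterns(select_sources(patterns))
print(grouped)    # patterns bundled per single relevant endpoint
print(ungrouped)  # patterns that more than one endpoint can answer
```

In this toy setup the two `dbo:` patterns travel to the simulated `dbpedia` endpoint in one request instead of two, while the `rdf:type` pattern stays ungrouped because two endpoints can answer it.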

Published In

Guide Proceedings
ISWC'11: Proceedings of the 10th international conference on The semantic web - Volume Part I
October 2011
809 pages
ISBN: 9783642250729

Sponsors

  • Elsevier
  • Yahoo! Research
  • Microsoft Research
  • OASIS
  • THESEUS

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article
