DOI: 10.1145/2627770.2627771

Parasol: An Architecture for Cross-Cloud Federated Graph Querying

Published: 22 June 2014

Abstract

Large-scale fusion of multiple datasets can often provide insights that individual datasets cannot. However, when these datasets reside in different data centers and cannot be co-located because of technical, administrative, or policy barriers, a unique set of problems arises that hampers querying and data fusion. To address these problems, a system and architecture named Parasol is presented that enables federated queries over graph databases residing in multiple clouds. Parasol's design is flexible and makes only minimal assumptions about client clouds. Query optimization techniques compatible with Parasol's lightweight architecture are also described. Experiments on a prototype implementation of Parasol indicate its suitability for cross-cloud federated graph queries.

Published In

DanaC'14: Proceedings of Workshop on Data analytics in the Cloud
June 2014
30 pages
ISBN:9781450329972
DOI:10.1145/2627770

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Graph database
  2. data fusion
  3. federated database
  4. query optimization

Qualifiers

  • Tutorial
  • Research
  • Refereed limited

Conference

SIGMOD/PODS'14
Acceptance Rates

DanaC'14 Paper Acceptance Rate 6 of 12 submissions, 50%;
Overall Acceptance Rate 19 of 34 submissions, 56%
