research-article

Fast and robust distributed subgraph enumeration

Published: 01 July 2019

Abstract

We study the subgraph enumeration problem in distributed settings. Existing solutions either suffer from severe memory pressure or rely on large indexes, which makes them impractical for very large graphs. Most of them follow a synchronous model, in which overall performance is often bottlenecked by the slowest machine. Motivated by this, we propose RADS, a Robust Asynchronous Distributed Subgraph enumeration system. RADS first identifies the results that can be found using single-machine algorithms. This strategy not only improves overall performance but also reduces network communication and memory cost. Moreover, RADS employs a novel region-grouped, multi-round expand, verify & filter framework that needs neither to shuffle and exchange intermediate results nor to replicate a large part of the data graph on each machine. This feature not only reduces network communication cost and memory usage, but also allows simple strategies for memory control and load balancing, making the system more robust. RADS also uses several optimization strategies to further improve performance. Our experiments verify the superiority of RADS over state-of-the-art subgraph enumeration approaches.
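
To make the problem statement concrete, the following is a minimal single-machine sketch of subgraph enumeration by backtracking. It is not the RADS algorithm, and it contains none of the distributed, asynchronous, or region-grouping machinery the abstract describes. It assumes undirected, unlabeled graphs stored as adjacency sets and the usual non-induced notion of an embedding (an injective vertex mapping that preserves pattern edges). The names `Graph`, `enumerate_embeddings`, and `_matching_order` are hypothetical and chosen for illustration only.

```python
from typing import Dict, Iterator, List, Set

# Assumption: an undirected graph is a dict from vertex id to its neighbour set.
Graph = Dict[int, Set[int]]


def _matching_order(pattern: Graph) -> List[int]:
    """Return pattern vertices in breadth-first order, so that each vertex
    (after the first of its component) has an already-matched neighbour."""
    order: List[int] = []
    seen: Set[int] = set()
    for start in sorted(pattern):
        if start in seen:
            continue
        seen.add(start)
        queue = [start]
        while queue:
            u = queue.pop(0)
            order.append(u)
            for w in sorted(pattern[u]):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return order


def enumerate_embeddings(pattern: Graph, data: Graph) -> Iterator[Dict[int, int]]:
    """Yield every injective mapping from pattern vertices to data vertices
    under which each pattern edge maps onto a data edge."""
    order = _matching_order(pattern)

    def extend(mapping: Dict[int, int], depth: int) -> Iterator[Dict[int, int]]:
        if depth == len(order):
            yield dict(mapping)  # copy, because mapping is mutated while backtracking
            return
        u = order[depth]
        matched_nbrs = [w for w in pattern[u] if w in mapping]
        if matched_nbrs:
            # Expand: candidates come from the neighbourhood of one matched image.
            candidates = set(data[mapping[matched_nbrs[0]]])
        else:
            candidates = set(data)
        used = set(mapping.values())
        for v in candidates:
            if v in used:
                continue  # keep the mapping injective
            # Verify: every already-matched pattern neighbour must be adjacent to v.
            if all(mapping[w] in data[v] for w in matched_nbrs):
                mapping[u] = v
                yield from extend(mapping, depth + 1)
                del mapping[u]

    yield from extend({}, 0)


if __name__ == "__main__":
    triangle: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    k4: Graph = {0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {0, 1, 2}}
    print(len(list(enumerate_embeddings(triangle, k4))))
```

Run on a triangle pattern and a 4-clique, this sketch reports 24 mappings: each of the four triangles is found six times, once per automorphism of the pattern, because no symmetry breaking is applied. Partial mappings are extended depth-first here, which keeps memory use low on one machine. How candidates are expanded and verified when the data graph is partitioned across machines is the problem the paper addresses.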

Published In

Proceedings of the VLDB Endowment  Volume 12, Issue 11
July 2019
543 pages

Publisher

VLDB Endowment

Author Tags

  1. asynchronous
  2. distributed system
  3. subgraph enumeration

Qualifiers

  • Research-article