research-article

Incrementalization of graph partitioning algorithms

Published: 01 April 2020

Abstract

This paper studies incremental graph partitioning. Given a (vertex-cut or edge-cut) partition C(G) of a graph G and updates ΔG to G, the problem is to compute changes ΔO to C(G) that yield a partition of the updated graph such that (a) the new partition is load-balanced, (b) its cut size is minimum, and (c) the changes ΔO are also minimum. We show that this tri-criteria optimization problem is NP-complete, even when ΔG has a constant size. Worse yet, it is unbounded, i.e., there exists no algorithm that computes such ΔO with a cost that is determined only by the changes ΔG and ΔO. We address this by incrementalizing widely-used graph partitioners A into heuristically-bounded incremental algorithms AΔ. Given a graph G, updates ΔG to G, and a partition A(G) of G computed by A, AΔ computes changes ΔO to A(G) such that (1) applying ΔO to A(G) produces a new partition of the updated graph, although it may not be exactly the one derived by A, (2) it retains the same bounds on balance and cut sizes as A, and (3) ΔO is decided by ΔG alone. We show that we can deduce AΔ from both vertex-cut and edge-cut partitioners A, retaining their bounds. Using real-life and synthetic data, we verify the efficiency and partition quality of our incremental partitioners.
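
To make the quantities in the problem statement concrete, the following is a minimal sketch for the edge-cut case only. It is not the paper's AΔ: the function names, the slack parameter `eps`, and the greedy placement rule are all assumptions made for illustration. It measures cut size and balance, then handles a batch of edge insertions (ΔG) by assigning only the newly seen vertices, returning those assignments as ΔO.

```python
import math
from collections import Counter

def cut_size(edges, part):
    """Edge-cut size: edges whose endpoints lie in different fragments."""
    return sum(1 for u, v in edges if part[u] != part[v])

def imbalance(part, k):
    """Largest fragment size divided by the ideal size |V| / k."""
    sizes = Counter(part.values())
    return max(sizes.values()) / (len(part) / k)

def apply_insertions(edges, part, k, new_edges, eps=0.1):
    """Apply edge insertions (delta G) in place and return delta O.

    Hypothetical greedy rule, assumed for illustration: a new vertex joins
    its neighbor's fragment unless that fragment has reached the cap
    ceil((1 + eps) * |V| / k); otherwise it joins the smallest fragment.
    Existing vertices are never moved, so the work depends only on delta G.
    """
    sizes = Counter({i: 0 for i in range(k)})
    sizes.update(part.values())
    delta_o = {}
    for u, v in new_edges:
        for x, y in ((u, v), (v, u)):
            if x in part:
                continue  # already assigned; leave it where it is
            cap = math.ceil((1 + eps) * (len(part) + 1) / k)
            target = part.get(y)
            if target is None or sizes[target] >= cap:
                target = min(range(k), key=lambda i: sizes[i])
            part[x] = target
            sizes[target] += 1
            delta_o[x] = target
        edges.append((u, v))
    return delta_o

# A path 0-1-2-3 split into two fragments, then one inserted edge (3, 4).
edges = [(0, 1), (1, 2), (2, 3)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
delta_o = apply_insertions(edges, part, k=2, new_edges=[(3, 4)])
print(delta_o, cut_size(edges, part), imbalance(part, 2))
```

In this toy run the new vertex 4 follows its neighbor 3 into fragment 1, so the cut size is unchanged while the balance degrades slightly. That is the trade-off among criteria (a), (b), and (c) that the abstract describes.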


    Information

    Published In

    Proceedings of the VLDB Endowment, Volume 13, Issue 8
    April 2020
    172 pages
    ISSN: 2150-8097

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 April 2020
    Published in PVLDB Volume 13, Issue 8

    Qualifiers

    • Research-article
