KLA: a new algorithmic paradigm for parallel graph computations

Authors:

Nancy M. Amato,

Lawrence Rauchwerger

PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation

Pages 27 - 38

https://doi.org/10.1145/2628071.2628091

Published: 24 August 2014

Abstract

This paper proposes a new algorithmic paradigm, k-level asynchronous (KLA), that bridges the level-synchronous and asynchronous paradigms for processing graphs. The KLA paradigm enables the level of asynchrony in parallel graph algorithms to be parametrically varied from none (level-synchronous) to full (asynchronous). The motivation is to improve execution times through an appropriate trade-off between the use of fewer, but more expensive, global synchronizations, as in level-synchronous algorithms, and more, but less expensive, local synchronizations (and perhaps also redundant work), as in asynchronous algorithms. We show how common patterns in graph algorithms can be expressed in the KLA paradigm and provide techniques for determining k, the number of asynchronous steps allowed between global synchronizations. Results of an implementation of KLA in the STAPL Graph Library show excellent scalability on up to 96K cores and improvements of 10x or more over level-synchronous and asynchronous versions for graph algorithms such as breadth-first search, PageRank, k-core decomposition, and others on certain classes of real-world graphs.
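The role of k can be illustrated with a small sequential simulation of a breadth-first search. This is only a sketch under stated assumptions, not the STAPL implementation or the paper's actual algorithm: the names `kla_bfs`, `pending`, `limit`, and `example_graph` are hypothetical, the graph is a plain adjacency dictionary, and asynchronous message delivery is mimicked by popping work in LIFO order. Within one superstep, distance proposals are forwarded immediately as long as they stay inside a window of k levels; proposals that fall outside the window wait for the next global synchronization.

```python
import math


def kla_bfs(graph, source, k):
    """Sequentially simulate a k-level-asynchronous BFS (illustrative sketch).

    graph maps each vertex to a list of out-neighbours.
    Returns (dist, supersteps, updates), where supersteps counts global
    synchronizations and updates counts every distance write, including
    redundant ones that are later overwritten.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    dist = {v: math.inf for v in graph}
    pending = [(source, 0)]  # proposals held back until the next superstep
    supersteps = 0
    updates = 0
    while pending:
        # Distances strictly below this bound may be settled in this superstep.
        limit = (supersteps + 1) * k
        supersteps += 1
        work, pending = pending, []
        while work:
            v, d = work.pop()  # LIFO order stands in for arbitrary delivery order
            if d >= dist[v]:
                continue  # stale proposal: an equal or better distance is known
            dist[v] = d
            updates += 1
            for w in graph[v]:
                if d + 1 >= dist[w]:
                    continue  # cannot improve w, so do not send anything
                if d + 1 < limit:
                    work.append((w, d + 1))  # inside the window: forward now
                else:
                    pending.append((w, d + 1))  # outside: wait for the sync
    return dist, supersteps, updates


# Hypothetical six-vertex graph with a one-hop shortcut 0 -> 4 next to the
# longer route 0 -> 1 -> 2 -> 3 -> 4.
example_graph = {0: [4, 1], 1: [2], 2: [3], 3: [4], 4: [5], 5: []}
```

On `example_graph` with source 0, `kla_bfs(example_graph, 0, 1)` behaves like a level-synchronous search: 4 supersteps and 6 distance writes, one per vertex. With `k=10` the whole traversal fits in 1 superstep, but vertices 4 and 5 are first reached along the longer route and then corrected, giving 8 writes. Both runs return the same distances, which is the trade-off between synchronization count and redundant work that the abstract describes.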

Index Terms

KLA: a new algorithmic paradigm for parallel graph computations
1. Computing methodologies
  1. Parallel computing methodologies
    1. Parallel programming languages
2. Software and its engineering
  1. Software notations and tools
    1. General programming languages
      1. Language types
        Parallel programming languages
    2. Software libraries and repositories

Published In

PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation

August 2014

514 pages

ISBN:9781450328098

DOI:10.1145/2628071

General Chair: J. Nelson Amaral, University of Alberta, Canada

Program Chair: Josep Torrellas, University of Illinois, USA

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

IFIP WG 10.3
SIGARCH: ACM Special Interest Group on Computer Architecture
IEEE CS TCPP: IEEE Computer Society Technical Committee on Parallel Processing
IEEE CS TCAA: IEEE CS technical committee on architectural acoustics

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 August 2014

Qualifiers

Research-article

Conference

PACT '14

PACT '14: International Conference on Parallel Architectures and Compilation

August 24 - 27, 2014

Edmonton, AB, Canada

Acceptance Rates

PACT '14 Paper Acceptance Rate 54 of 144 submissions, 38%;

Overall Acceptance Rate 121 of 471 submissions, 26%
