skip to main content
10.1145/2628071.2628091acmconferencesArticle/Chapter ViewAbstractPublication PagespactConference Proceedingsconference-collections
research-article
Free access

KLA: a new algorithmic paradigm for parallel graph computations

Published: 24 August 2014 Publication History

Abstract

This paper proposes a new algorithmic paradigm - k-level asynchronous (KLA) - that bridges level-synchronous and asynchronous paradigms for processing graphs. The KLA paradigm enables the level of asynchrony in parallel graph algorithms to be parametrically varied from none (level-synchronous) to full (asynchronous). The motivation is to improve execution times through an appropriate trade-off between the use of fewer, but more expensive global synchronizations, as in level-synchronous algorithms, and more, but less expensive local synchronizations (and perhaps also redundant work), as in asynchronous algorithms. We show how common patterns in graph algorithms can be expressed in the KLA pardigm and provide techniques for determining k, the number of asynchronous steps allowed between global synchronizations. Results of an implementation of KLA in the STAPL Graph Library show excellent scalability on up to 96K cores and improvements of 10x or more over level-synchronous and asynchronous versions for graph algorithms such as breadth-first search, PageRank, k-core decomposition and others on certain classes of real-world graphs.

References

[1]
The graph 500 list. http://www.graph500.org, 2013.
[2]
Stanford Large Network Dataset Collection. http://snap.stanford.edu/data/index.html, 2013.
[3]
9th DIMACS Implementation Challenge. http://www.dis.uniroma1.it/challenge9/, 2013.
[4]
J. I. Alvarez-hamelin, A. Barrat, and A. Vespignani. Large scale networks fingerprinting and visualization using the k-core decomposition. In Adv. in Neural Inf. Proc. Syst. 18, pp. 41--50. MIT Press, 2006.
[5]
J. W. Berry, B. Hendrickson, S. Kahan, and P. Konecny. Software and algorithms for graph queries on multithreaded architectures. In Intl. Parallel and Distributed Processing Symp., 0:495, 2007.
[6]
U. Brandes. A faster algorithm for betweenness centrality. J. of Math. Sociology, pp. 163--177, 2001.
[7]
S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, pp. 107--117, 1998.
[8]
A. Buluç and K. Madduri. Parallel breadth-first search on distributed memory systems. In Proc. of Intl. Conf. for High Performance Computing, Networking, Storage and Analysis, SC '11, pp. 1--12, New York, NY, USA, 2011.
[9]
A. Buss, A. Fidel, Harshvardhan, T. Smith, G. Tanase, N. Thomas, X. Xu, M. Bianco, N. M. Amato, and L. Rauchwerger. The STAPL pView. In Intl. Workshop on Languages and Compilers for Parallel Computing (LCPC), in Lecture Notes in Computer Science (LNCS), Houston, TX, USA, 2010.
[10]
A. Buss, Harshvardhan, I. Papadopoulos, O. Pearce, T. Smith, G. Tanase, N. Thomas, X. Xu, M. Bianco, N. M. Amato, and L. Rauchwerger. STAPL: Standard Template Adaptive Parallel Library. In Proc. Annual Haifa Experimental Systems Conference (SYSTOR), pp. 1--10, New York, NY, USA, 2010.
[11]
N. Edmonds, J. Willcock, and A. Lumsdaine. Expressing graph algorithms using generalized active messages. In Proc. of Symp. on Principles and Practice of Parallel Programming, PPoPP '13, pp. 289--290, New York, NY, USA, 2013.
[12]
R. G. Gallager, P. A. Humblet, and P. M. Spira. A distributed algorithm for minimum-weight spanning trees. In Trans. Program. Lang. Syst., pp. 66--77, 1983.
[13]
D. Gregor and A. Lumsdaine. The Parallel BGL: A generic library for distributed graph computations. In Parallel Object-Oriented Scientific Computing, POOSC, 2005.
[14]
Harshvardhan, A. Fidel, N. M. Amato, and L. Rauchwerger. The STAPL Parallel Graph Library. In Languages and Compilers for Parallel Computing, Lecture Notes in Computer Science, pp. 46--60. Springer Berlin Heidelberg, 2012.
[15]
M. A. Hassaan, M. Burtscher, and K. Pingali. Ordered and unordered algorithms for parallel breadth first search. In Proc. of the Intl. Conf. on Parallel Architectures and Compilation Techniques, PACT '10, pp. 539--540, New York, NY, USA, 2010.
[16]
S. Hong, H. Chafi, E. Sedlar, and K. Olukotun. Green-Marl: A DSL for easy and efficient graph analysis. In Proc. of the Intl. Conf. on Architectural Support for Prog. Languages and Operating Syst., ASPLOS'12, pp. 349--362, New York, NY, USA, 2012.
[17]
S. Hong, T. Oguntebi, and K. Olukotun. Efficient parallel graph exploration for multi-core cpu and gpu. In Proc. of the Intl. Conf. on Parallel Architectures and Compilation Techniques, PACT '11, pp. 78--88.
[18]
J. JàJà. An Introduction Parallel Algorithms. Addison-Wesley, Reading, Massachusetts, 1992.
[19]
C. E. Leiserson and T. B. Schardl. A work-efficient parallel breadth-first search algorithm (or how to cope with the nondeterminism of reducers). In Proc. of the Symp. on Parallelism in Algorithms and Architectures, SPAA '10, pp. 303--314, New York, NY, USA, 2010.
[20]
Y. Low, D. Bickson, J. Gonzalez, C. Guestrin, A. Kyrola, and J. M. Hellerstein. Distributed Graphlab: A framework for machine learning and data mining in the cloud. Proc. of the VLDB Endowment, pp. 716--727, 2012.
[21]
G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A system for large-scale graph processing. In Proc. of the Intl. Conf. on Management of data, SIGMOD '10, pp. 135--146, New York, NY, USA, 2010.
[22]
U. Meyer and P. Sanders. Delta-stepping : A parallel single source shortest path algorithm. In ESA '98: Proc. of the European Symp. on Algorithms, pp. 393--404. Springer-Verlag, 1998.
[23]
L. Page, S. Brin, R. Motwani and T. Winograd. The PageRank Citation Ranking: Bringing Order to the Web. 1998.
[24]
R. Pearce, M. Gokhale, and N. M. Amato. Multithreaded asynchronous graph traversal for in-memory and semi-external memory. In Proc. of Intl. Conf. for High Performance Computing, Networking, Storage and Analysis, SC '10, pp. 1--11, Washington, DC, USA, 2010.
[25]
D. Prountzos, R. Manevich, and K. Pingali. Elixir: A system for synthesizing concurrent graph programs. In Proc. of the Intl. Conf. on Object Oriented Program. Syst. Languages and Applications, OOPSLA '12, pp. 375--394, New York, NY, USA, 2012.
[26]
M. J. Quinn and N. Deo. Parallel graph algorithms. In ACM Computing Surveys (CSUR), pp. 319--348, 1984.
[27]
J. H. Reif, editor. Synthesis of Parallel Algorithms. Morgan Kaufmann, San Mateo, CA, 1993.
[28]
P. Stutz, A. Bernstein, and W. Cohen. Signal/collect: Graph algorithms for the (semantic) web. In The Semantic Web-ISWC '10, pp. 764--780. Springer, 2010.
[29]
G. Tanase, A. Buss, A. Fidel, Harshvardhan, I. Papadopoulos, O. Pearce, T. Smith, N. Thomas, X. Xu, N. Mourad, J. Vu, M. Bianco, N. M. Amato, and L. Rauchwerger. The STAPL Parallel Container Framework. In Proc. of Symp. on Principles and Practice of Parallel Programming, PPoPP, pp. 235--246, San Antonio, TX, USA, 2011.
[30]
L. Valiant. Bridging model for parallel computation. Comm. ACM, pp. 103--111, 1990.
[31]
G. Wang, W. Xie, A. J. Demers, and J. Gehrke. Asynchronous large-scale graph processing made easy. In CIDR, 2013.
[32]
D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks. Nature, pp. 440--442, 1998.

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
PACT '14: Proceedings of the 23rd international conference on Parallel architectures and compilation
August 2014
514 pages
ISBN:9781450328098
DOI:10.1145/2628071
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 August 2014

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. asynchronous graph algorithms
  2. big data
  3. distributed computing
  4. graph analytics
  5. parallel algorithms

Qualifiers

  • Research-article

Funding Sources

Conference

PACT '14
Sponsor:
  • IFIP WG 10.3
  • SIGARCH
  • IEEE CS TCPP
  • IEEE CS TCAA

Acceptance Rates

PACT '14 Paper Acceptance Rate 54 of 144 submissions, 38%;
Overall Acceptance Rate 121 of 471 submissions, 26%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)143
  • Downloads (Last 6 weeks)22
Reflects downloads up to 27 Dec 2024

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media