research-article

Hypergraph-based locality-enhancing methods for graph operations in Big Data applications

Author:

Kadir Akbudak

The International Journal of High Performance Computing Applications, Volume 38, Issue 3

Pages 210 - 224

https://doi.org/10.1177/10943420231214532

Published: 01 May 2024

Abstract

The need to speed up data analytics keeps growing, driven by the demand for extracting valuable information from social media, data generated by sensor-equipped smart devices, patterns of people’s communications over the web, items viewed and bought by customers worldwide, cloud applications, and other sources that together make up “Big Data.” Such interaction data is well represented as sparse graphs, which enables graph analytics that in turn requires efficient underlying kernels. Breadth-first search (BFS)-based traversal is a commonly used kernel in graph algorithms, for example in the betweenness centrality algorithm for centrality analysis. In this work, we focus on parallel BFS operations and propose hypergraph-based combinatorial models that aim to reduce cache misses, and hence exploit data locality, during parallel BFS operations. Our models find new vertex visit orders so that locality in accessing the data associated with vertices is exploited. Experiments on graphs arising in a wide range of applications show that the proposed models achieve a 9% performance improvement on average in the CPU-based Ligra data analytics framework.
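
To make the idea of a vertex visit order concrete, the following is a minimal sketch of vertex relabeling on a small adjacency-list graph. It is not the hypergraph-based model proposed in the article: as a stand-in, it simply renumbers vertices in BFS discovery order, and it uses the total ID distance across edges as a crude proxy for locality. All function names (`bfs_levels`, `bfs_visit_order`, `relabel`, `id_spread`) and the example graph are assumptions made for illustration.

```python
from collections import deque


def bfs_levels(adj, source):
    """Return the BFS level of every vertex (-1 if unreachable)."""
    level = [-1] * len(adj)
    level[source] = 0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if level[v] == -1:
                level[v] = level[u] + 1
                queue.append(v)
    return level


def bfs_visit_order(adj, source):
    """Order vertices by BFS discovery; other components follow in ID order."""
    seen = [False] * len(adj)
    order = []
    for start in [source] + list(range(len(adj))):
        if seen[start]:
            continue
        seen[start] = True
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
    return order


def relabel(adj, order):
    """Renumber vertices so that order[i] becomes vertex i."""
    new_id = [0] * len(adj)
    for new, old in enumerate(order):
        new_id[old] = new
    new_adj = [[] for _ in adj]
    for old, nbrs in enumerate(adj):
        new_adj[new_id[old]] = sorted(new_id[v] for v in nbrs)
    return new_adj, new_id


def id_spread(adj):
    """Sum of |u - v| over all directed edges (hypothetical locality proxy)."""
    return sum(abs(u - v) for u, nbrs in enumerate(adj) for v in nbrs)


# Hypothetical undirected example graph whose IDs are scattered on purpose.
adj = [[5], [4], [3], [2, 5], [1, 5], [0, 3, 4]]
order = bfs_visit_order(adj, 0)
new_adj, new_id = relabel(adj, order)

old_levels = bfs_levels(adj, 0)
new_levels = bfs_levels(new_adj, new_id[0])
print("visit order:", order)
print("ID spread before/after:", id_spread(adj), id_spread(new_adj))
```

Relabeling changes only where per-vertex data sits in memory, not the result of the traversal: every vertex keeps its BFS level under the new numbering. A smaller ID spread suggests that neighbors are stored closer together, although real cache behavior depends on the hardware and on the framework's data layout.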


Published In

International Journal of High Performance Computing Applications Volume 38, Issue 3

May 2024

123 pages

© The Author(s) 2023.

Publisher

Sage Publications, Inc.

United States
