DOI: 10.1145/3662010.3663443
Short paper · Open access

In situ neighborhood sampling for large-scale GNN training

Published: 09 June 2024

Abstract

Graph Neural Network (GNN) training algorithms commonly perform neighborhood sampling to construct fixed-size mini-batches for weight aggregation on GPUs. State-of-the-art disk-based GNN frameworks run sampling on the CPU, transferring edge partitions from disk to memory for every mini-batch. We argue that this design wastes significant PCIe bandwidth, as entire neighborhoods are transferred to main memory only to be discarded after sampling. In this paper, we take a first step toward an inherently different approach that harnesses near-storage compute technology to achieve efficient large-scale GNN training. We target a single machine with one or more SmartSSD devices and develop a high-throughput, epoch-wide sampling FPGA kernel that enables pipelining across epochs. Compared to a baseline random-access sampling kernel, our solution achieves up to 4.26× lower sampling time per epoch.
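To make the data-movement argument concrete, below is a minimal Python sketch of the fixed-size neighborhood sampling step that disk-based frameworks run on the CPU. This is illustrative only, not the paper's FPGA kernel: the CSR arrays `indptr`/`indices`, the `fanout` parameter, and the `sample_neighbors` helper are assumptions introduced for this example.

```python
import numpy as np

def sample_neighbors(indptr, indices, seeds, fanout, rng):
    """Uniformly sample up to `fanout` neighbors per seed vertex
    from a graph stored in CSR format (indptr/indices).

    Illustrative sketch; not the paper's SmartSSD kernel."""
    src, dst = [], []
    for v in seeds:
        # The full adjacency list must be read before sampling --
        # in a disk-based CPU sampler this entire slice crosses PCIe,
        # even though most of it is discarded right below.
        neighbors = indices[indptr[v]:indptr[v + 1]]
        if len(neighbors) > fanout:
            neighbors = rng.choice(neighbors, size=fanout, replace=False)
        src.extend(neighbors)
        dst.extend([v] * len(neighbors))
    return np.asarray(src), np.asarray(dst)

# Toy usage: 4 vertices, fanout of 2, mini-batch of seed vertices {0, 3}.
indptr = np.array([0, 3, 5, 6, 8])
indices = np.array([1, 2, 3, 0, 2, 0, 1, 2])
rng = np.random.default_rng(0)
src, dst = sample_neighbors(indptr, indices, seeds=[0, 3], fanout=2, rng=rng)
```

Each mini-batch keeps only `fanout` edges per seed, so most of the bytes fetched from disk are thrown away after the sampling step; the paper's approach instead executes this loop near the flash, on the SmartSSD's FPGA, and over an entire epoch's seeds at once so that sampling can be pipelined with training.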

      Published In

      DaMoN '24: Proceedings of the 20th International Workshop on Data Management on New Hardware, June 2024, 123 pages.
      ISBN: 9798400706677
      DOI: 10.1145/3662010
      This work is licensed under a Creative Commons Attribution 4.0 International License.

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Author Tags

      1. GNN training
      2. SmartSSD
      3. graph neural networks

      Qualifiers

      • Short paper
      • Research
      • Refereed limited

      Conference

      SIGMOD/PODS '24

      Acceptance Rates

      DaMoN '24 Paper Acceptance Rate: 14 of 25 submissions, 56%
      Overall Acceptance Rate: 94 of 127 submissions, 74%
