DOI: 10.1145/3662010.3663443
Short paper · Open access

In situ neighborhood sampling for large-scale GNN training

Published: 09 June 2024

Abstract

Graph Neural Network (GNN) training algorithms commonly perform neighborhood sampling to construct fixed-size mini-batches for weight aggregation on GPUs. State-of-the-art disk-based GNN frameworks run sampling on the CPU, transferring edge partitions from disk to memory for every mini-batch. We argue that this design wastes significant PCIe bandwidth, as entire neighborhoods are transferred to main memory only to be discarded after sampling. In this paper, we take a first step toward an inherently different approach that harnesses near-storage compute technology to achieve efficient large-scale GNN training. We target a single machine with one or more SmartSSD devices and develop a high-throughput, epoch-wide sampling FPGA kernel that enables pipelining across epochs. Compared to a baseline random-access sampling kernel, our solution achieves up to 4.26× lower sampling time per epoch.
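To make the data-movement argument concrete, below is a minimal Python sketch of the fixed-size neighborhood sampling step that disk-based frameworks run on the CPU. This is illustrative only, not the paper's FPGA kernel: the CSR arrays `indptr`/`indices`, the `fanout` parameter, and the `sample_neighbors` helper are assumptions introduced for this example.

```python
import numpy as np

def sample_neighbors(indptr, indices, seeds, fanout, rng):
    """Uniformly sample up to `fanout` neighbors per seed vertex
    from a graph stored in CSR format (indptr/indices).

    Illustrative sketch; not the paper's SmartSSD kernel."""
    src, dst = [], []
    for v in seeds:
        # The full adjacency list must be read before sampling --
        # in a disk-based CPU sampler this entire slice crosses PCIe,
        # even though most of it is discarded right below.
        neighbors = indices[indptr[v]:indptr[v + 1]]
        if len(neighbors) > fanout:
            neighbors = rng.choice(neighbors, size=fanout, replace=False)
        src.extend(neighbors)
        dst.extend([v] * len(neighbors))
    return np.asarray(src), np.asarray(dst)

# Toy usage: 4 vertices, fanout of 2, mini-batch of seed vertices {0, 3}.
indptr = np.array([0, 3, 5, 6, 8])
indices = np.array([1, 2, 3, 0, 2, 0, 1, 2])
rng = np.random.default_rng(0)
src, dst = sample_neighbors(indptr, indices, seeds=[0, 3], fanout=2, rng=rng)
```

Each mini-batch keeps only `fanout` edges per seed, so most of the bytes fetched from disk are thrown away after the sampling step; the paper's approach instead executes this loop near the flash, on the SmartSSD's FPGA, and over an entire epoch's seeds at once so that sampling can be pipelined with training.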

      Published In

      DaMoN '24: Proceedings of the 20th International Workshop on Data Management on New Hardware, June 2024, 123 pages.
      ISBN: 9798400706677
      DOI: 10.1145/3662010
      This work is licensed under a Creative Commons Attribution 4.0 International License.

      Publisher

      Association for Computing Machinery, New York, NY, United States

      Author Tags

      1. GNN training
      2. SmartSSD
      3. graph neural networks

      Qualifiers

      • Short paper
      • Research
      • Refereed limited

      Conference

      SIGMOD/PODS '24

      Acceptance Rates

      DaMoN '24 Paper Acceptance Rate: 14 of 25 submissions, 56%
      Overall Acceptance Rate: 94 of 127 submissions, 74%
