research-article

AliGraph: a comprehensive graph neural network platform

Authors:

Jingren Zhou

Proceedings of the VLDB Endowment, Volume 12, Issue 12

Pages 2094 - 2105

https://doi.org/10.14778/3352063.3352127

Published: 01 August 2019

Abstract

An increasing number of machine learning tasks require dealing with large graph datasets, which capture rich and complex relationships among potentially billions of elements. Graph Neural Networks (GNNs) have become an effective way to address the graph learning problem: they convert graph data into a low-dimensional space while preserving as much structural and property information as possible, and construct a neural network for training and inference. However, it is challenging to provide efficient graph storage and computation capabilities that facilitate GNN training and enable the development of new GNN algorithms. In this paper, we present a comprehensive graph neural network system, namely AliGraph, which consists of distributed graph storage, optimized sampling operators, and a runtime that efficiently supports not only existing popular GNNs but also a series of in-house developed ones for different scenarios. The system is currently deployed at Alibaba to support a variety of business scenarios, including product recommendation and personalized search on Alibaba's e-commerce platform. In extensive experiments on a real-world dataset with 492.90 million vertices, 6.82 billion edges, and rich attributes, AliGraph builds the graph an order of magnitude faster (5 minutes versus the hours reported for the state-of-the-art PowerGraph platform). In training, AliGraph runs 40%-50% faster with its novel caching strategy and achieves a speedup of around 12 times with the improved runtime. In addition, our in-house developed GNN models all show statistically significant improvements in both effectiveness and efficiency (e.g., a 4.12%-17.19% lift in F1 score).
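The abstract names sampling operators and the mapping of graph data into a low-dimensional space, but it does not show either. As a rough illustration only, the following sketch samples a fixed number of neighbors per vertex and mean-aggregates their feature vectors, which is one common GNN building block. It is not AliGraph's API: the function names (`sample_neighbors`, `aggregate_mean`, `embed`), the toy graph, and the choice of mean aggregation are all assumptions made for this example.

```python
import random


def sample_neighbors(adj, node, k, rng):
    """Hypothetical helper: draw k neighbors of `node` with replacement."""
    neighbors = adj.get(node, [])
    if not neighbors:
        return []  # isolated vertex: nothing to sample
    return [rng.choice(neighbors) for _ in range(k)]


def aggregate_mean(features, node, sampled):
    """Average the vertex's own features with those of its sampled neighbors."""
    vectors = [features[node]] + [features[n] for n in sampled]
    dim = len(features[node])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed(adj, features, k=2, seed=0):
    """One sampling-plus-aggregation pass over every vertex (no learned weights)."""
    rng = random.Random(seed)
    return {
        node: aggregate_mean(features, node, sample_neighbors(adj, node, k, rng))
        for node in features
    }


# Toy user-item graph; identifiers and feature values are made up.
adj = {"u1": ["i1", "i2"], "i1": ["u1"], "i2": ["u1"], "i3": []}
features = {
    "u1": [1.0, 0.0],
    "i1": [0.0, 1.0],
    "i2": [0.0, 3.0],
    "i3": [2.0, 2.0],
}
embeddings = embed(adj, features)
```

A real system would apply learned weight matrices and a nonlinearity after aggregation and would run the sampling against distributed storage. This sketch keeps only the sample-then-aggregate shape.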

Published In

Proceedings of the VLDB Endowment, Volume 12, Issue 12

August 2019

547 pages

ISSN: 2150-8097

Editors: Lei Chen, Fatma Özcan

Publisher

VLDB Endowment
