research-article

AliGraph: a comprehensive graph neural network platform

Published: 01 August 2019

Abstract

An increasing number of machine learning tasks require dealing with large graph datasets, which capture rich and complex relationships among potentially billions of elements. Graph Neural Networks (GNNs) have become an effective way to address the graph learning problem: they convert graph data into a low-dimensional space while preserving as much structural and property information as possible, and construct a neural network for training and inference. However, it is challenging to provide efficient graph storage and computation capabilities that facilitate GNN training and enable the development of new GNN algorithms. In this paper, we present a comprehensive graph neural network system, namely AliGraph, which consists of distributed graph storage, optimized sampling operators, and a runtime that efficiently supports not only existing popular GNNs but also a series of in-house developed ones for different scenarios. The system is currently deployed at Alibaba to support a variety of business scenarios, including product recommendation and personalized search on Alibaba's e-commerce platform. In extensive experiments on a real-world dataset with 492.90 million vertices, 6.82 billion edges, and rich attributes, AliGraph builds the graph an order of magnitude faster than the state-of-the-art PowerGraph platform (5 minutes vs. the hours reported for PowerGraph). During training, AliGraph runs 40%-50% faster with the novel caching strategy and achieves a speedup of around 12 times with the improved runtime. In addition, our in-house developed GNN models all show statistically significant improvements in both effectiveness and efficiency (e.g., a 4.12% to 17.19% lift in F1 score).
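
The general idea of sampling a vertex's neighborhood and aggregating features into an embedding can be illustrated with a minimal Python sketch. This is a generic toy example and not AliGraph's API or its sampling operators: the function names (`sample_neighbors`, `aggregate_mean`, `embed`), the mean aggregator, and the in-memory adjacency dictionary are all assumptions made for illustration.

```python
import random

def sample_neighbors(adj, vertex, k, rng):
    """Sample k neighbors of `vertex` with replacement (hypothetical helper).

    Returns an empty list when the vertex has no neighbors.
    """
    neighbors = adj.get(vertex, [])
    if not neighbors:
        return []
    return [rng.choice(neighbors) for _ in range(k)]

def aggregate_mean(features, vertex, sampled):
    """Average the vertex's own feature vector with its sampled neighbors' vectors."""
    vectors = [features[vertex]] + [features[u] for u in sampled]
    dim = len(features[vertex])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]

def embed(adj, features, k=2, seed=0):
    """Compute a one-hop embedding for every vertex (no learned weights)."""
    rng = random.Random(seed)  # fixed seed keeps the sampling reproducible
    return {
        v: aggregate_mean(features, v, sample_neighbors(adj, v, k, rng))
        for v in features
    }

# Toy graph: "a" is connected to "b" and "c"; "d" is isolated.
adj = {"a": ["b", "c"], "b": ["a"], "c": ["a"], "d": []}
features = {
    "a": [1.0, 0.0],
    "b": [0.0, 1.0],
    "c": [1.0, 1.0],
    "d": [0.5, 0.5],
}
embeddings = embed(adj, features)
```

In a real GNN the aggregation step would be followed by learned transformations, and in a system at the scale described above the adjacency data would be partitioned across machines rather than held in a single dictionary.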

Published In

Proceedings of the VLDB Endowment, Volume 12, Issue 12
August 2019
547 pages

Publisher

VLDB Endowment
