Optimizing DNN computation graph using graph substitutions

Published: 01 July 2020

Abstract

Deep learning has achieved great success in various real-world applications. As deep neural networks (DNNs) grow larger, the cost of DNN inference and training increases significantly. Since one round of inference or one iteration of training is typically modeled as a computation graph, existing works optimize computation graphs by applying a sequence of functionally equivalent graph substitutions, leading to higher inference and training efficiency. In this work, we formally define the Optimizing Computation Graph using Graph Substitutions (OCGGS) problem and prove it to be NP-hard and Poly-APX-complete. We develop two exact and efficient methods for the OCGGS problem: a pruning-based algorithm that eliminates the examination of redundant graph substitution sequences, and a dynamic programming with pruning algorithm that reuses previously explored graph substitutions. To further speed up the search, we propose a sampling heuristic that effectively optimizes complex computation graphs in polynomial time and space. Extensive experiments on various DNN architectures and sizes verify the effectiveness and efficiency of our proposed solutions compared with existing techniques.
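The core idea of the abstract, searching over sequences of functionally equivalent graph substitutions to reduce the cost of a computation graph, can be illustrated with a small sketch. The Python toy below is not the paper's algorithm: the graph encoding, the two substitution rules (fuse_conv_relu, split_fused), and the per-operator cost model are all hypothetical placeholders chosen for illustration. Only the overall shape, a backtracking search over substitution sequences with a visited-set prune that keeps the cheapest graph seen, mirrors what the abstract describes.

# Toy sketch of substitution-based graph optimization; all names,
# rules, and costs below are illustrative assumptions, not the
# paper's actual method.
from typing import Callable, FrozenSet, List, Tuple

Graph = FrozenSet[str]                 # a computation graph as a set of op labels
Rule = Callable[[Graph], List[Graph]]  # a rule maps a graph to equivalent graphs

def cost(g: Graph) -> int:
    """Toy cost model: one unit per operator."""
    return len(g)

def fuse_conv_relu(g: Graph) -> List[Graph]:
    """Hypothetical substitution: fuse a conv followed by a relu into one kernel."""
    if {"conv", "relu"} <= g:
        return [(g - {"conv", "relu"}) | {"conv_relu"}]
    return []

def split_fused(g: Graph) -> List[Graph]:
    """Hypothetical inverse rule: a locally worse graph can enable later wins."""
    if "conv_relu" in g:
        return [(g - {"conv_relu"}) | {"conv", "relu"}]
    return []

def search(g0: Graph, rules: List[Rule], depth: int) -> Tuple[Graph, int]:
    """Backtracking over substitution sequences up to a depth bound,
    pruning graphs already visited and tracking the cheapest graph seen."""
    best, best_cost = g0, cost(g0)
    seen = {g0}

    def dfs(g: Graph, d: int) -> None:
        nonlocal best, best_cost
        if cost(g) < best_cost:
            best, best_cost = g, cost(g)
        if d == depth:
            return
        for rule in rules:
            for g2 in rule(g):
                if g2 not in seen:      # prune revisited graphs
                    seen.add(g2)
                    dfs(g2, d + 1)

    dfs(g0, 0)
    return best, best_cost

if __name__ == "__main__":
    g = frozenset({"conv", "relu", "matmul"})
    print(search(g, [fuse_conv_relu, split_fused], depth=4))

On the toy input this prints the fused graph with cost 2 after one substitution. The paper's exact algorithms and sampling heuristic address the same search space, but at realistic scale, where the number of substitution sequences explodes and pruning and sampling become essential.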




Published In

Proceedings of the VLDB Endowment, Volume 13, Issue 12
August 2020, 1710 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 July 2020
Published in PVLDB Volume 13, Issue 12

Qualifiers

  • Research-article
