Article

An Empirical Study on Bugs Inside TensorFlow

Authors:

Xuansheng Lu

Database Systems for Advanced Applications: 25th International Conference, DASFAA 2020, Jeju, South Korea, September 24–27, 2020, Proceedings, Part I

Pages 604 - 620

https://doi.org/10.1007/978-3-030-59410-7_40

Published: 24 September 2020

Abstract

In recent years, deep learning has become a hot research topic. Although it achieves impressive results in some scenarios, bugs inside deep learning software can have disastrous consequences, especially when the software is used in safety-critical applications. To understand the bug characteristics of deep learning software, researchers have conducted several empirical studies on deep learning bugs. Although these studies present useful findings, we notice that none of them analyzes the bug characteristics inside a deep learning library like TensorFlow. We argue that some fundamental questions about bugs in deep learning libraries are still open. For example, what are the symptoms and the root causes of bugs inside TensorFlow, and where are they located? Because TensorFlow is the underlying library of many deep learning projects and its bugs can affect those projects, the answers to these questions are useful and important. In this paper, we conduct the first empirical study to analyze the bugs inside a typical deep learning library, i.e., TensorFlow. Based on our results, we summarize 5 findings and present our answers to 2 research questions. For example, we find that the symptoms and root causes of TensorFlow bugs are more like those of ordinary projects (e.g., Mozilla) than those of other machine learning libraries (e.g., Lucene). As another example, we find that most TensorFlow bugs reside in its interfaces (26.24%), its learning algorithms (11.79%), and in how TensorFlow is compiled (8.02%), deployed (7.55%), and installed (4.72%) across platforms.



Published In

Database Systems for Advanced Applications: 25th International Conference, DASFAA 2020, Jeju, South Korea, September 24–27, 2020, Proceedings, Part I

Sep 2020

837 pages

ISBN: 978-3-030-59409-1

DOI: 10.1007/978-3-030-59410-7

Editors:
Yunmook Nah, Dankook University, Yongin, Korea (Republic of)
Bin Cui, Peking University, Haidian, China
Sang-Won Lee, Sungkyunkwan University, Suwon, Korea (Republic of)
Jeffrey Xu Yu, Department of System Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong, Hong Kong
Yang-Sae Moon, Kangwon National University, Chunchon, Korea (Republic of)
Steven Euijong Whang, Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)

© Springer Nature Switzerland AG 2020.

Publisher

Springer-Verlag

Berlin, Heidelberg


Citations

Cited By

Zou Y, Zhai J, Fang C, Liu J, Zheng T, Chen Z, Filkov V, Ray B, Zhou M (2024). Mutation-Based Deep Learning Framework Testing Method in JavaScript Environment. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 970-981. DOI: 10.1145/3691620.3695478. Online publication date: 27-Oct-2024
https://dl.acm.org/doi/10.1145/3691620.3695478
Guan H, Bai G, Liu Y, Christakis M, Pradel M (2024). Large Language Models Can Connect the Dots: Exploring Model Optimization Bugs with Domain Knowledge-Aware Prompts. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1579-1591. DOI: 10.1145/3650212.3680383. Online publication date: 11-Sep-2024
https://dl.acm.org/doi/10.1145/3650212.3680383
Ma H, Zhang W, Shen Q, Tian Y, Chen J, Cheung S, Christakis M, Pradel M (2024). Towards Understanding the Bugs in Solidity Compiler. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, pp. 1312-1324. DOI: 10.1145/3650212.3680362. Online publication date: 11-Sep-2024
https://dl.acm.org/doi/10.1145/3650212.3680362
Gao K, He R, Xie B, Zhou M (2024). Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement. ACM Transactions on Software Engineering and Methodology 33:4, pp. 1-27. DOI: 10.1145/3640336. Online publication date: 10-Jan-2024
https://dl.acm.org/doi/10.1145/3640336
Jiang J, Yang J, Zhang Y, Wang Z, You H, Chen J (2024). A Post-training Framework for Improving the Performance of Deep Learning Models via Model Transformation. ACM Transactions on Software Engineering and Methodology 33:3, pp. 1-41. DOI: 10.1145/3630011. Online publication date: 15-Mar-2024
https://dl.acm.org/doi/10.1145/3630011
Gao Y, He Y, Li X, Zhao B, Lin H, Liang Y, Zhong J, Zhang H, Wang J, Zeng Y, Gui K, Tong J, Yang M, Roychoudhury A, Paiva A, Abreu R, Storey M (2024). An Empirical Study on Low GPU Utilization of Deep Learning Jobs. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, pp. 1-13. DOI: 10.1145/3597503.3639232. Online publication date: 20-May-2024
https://dl.acm.org/doi/10.1145/3597503.3639232
Liu J, Huang Y, Wang Z, Ma L, Fang C, Gu M, Zhang X, Chen Z (2023). Generation-based Differential Fuzzing for Deep Learning Libraries. ACM Transactions on Software Engineering and Methodology 33:2, pp. 1-28. DOI: 10.1145/3628159. Online publication date: 23-Dec-2023
https://dl.acm.org/doi/10.1145/3628159
Mo R, Zhang Y, Wang Y, Zhang S, Xiong P, Li Z, Zhao Y (2023). Exploring the Impact of Code Clones on Deep Learning Software. ACM Transactions on Software Engineering and Methodology 32:6, pp. 1-34. DOI: 10.1145/3607181. Online publication date: 28-Sep-2023
https://dl.acm.org/doi/10.1145/3607181
Chen J, Liang Y, Shen Q, Jiang J, Li S (2023). Toward Understanding Deep Learning Framework Bugs. ACM Transactions on Software Engineering and Methodology 32:6, pp. 1-31. DOI: 10.1145/3587155. Online publication date: 29-Sep-2023
https://dl.acm.org/doi/10.1145/3587155
Quan L, Guo Q, Xie X, Chen S, Li X, Liu Y (2022). Towards Understanding the Faults of JavaScript-Based Deep Learning Systems. Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, pp. 1-13. DOI: 10.1145/3551349.3560427. Online publication date: 10-Oct-2022
https://dl.acm.org/doi/10.1145/3551349.3560427
