skip to main content
research-article
Open access

Well-typed programs can go wrong: a study of typing-related bugs in JVM compilers

Published: 15 October 2021 Publication History

Abstract

Despite the substantial progress in compiler testing, research endeavors have mainly focused on detecting compiler crashes and subtle miscompilations caused by bugs in the implementation of compiler optimizations. Surprisingly, this growing body of work neglects other compiler components, most notably the front-end. In statically-typed programming languages with rich and expressive type systems and modern features, such as type inference or a mix of object-oriented with functional programming features, the process of static typing in compiler front-ends is complicated by a high-density of bugs. Such bugs can lead to the acceptance of incorrect programs (breaking code portability or the type system's soundness), the rejection of correct (e.g. well-typed) programs, and the reporting of misleading errors and warnings.
We conduct, what is to the best of our knowledge, the first empirical study for understanding and characterizing typing-related compiler bugs. To do so, we manually study 320 typing-related bugs (along with their fixes and test cases) that are randomly sampled from four mainstream JVM languages, namely Java, Scala, Kotlin, and Groovy. We evaluate each bug in terms of several aspects, including their symptom, root cause, bug fix's size, and the characteristics of the bug-revealing test cases. Some representative observations indicate that: (1) more than half of the typing-related bugs manifest as unexpected compile-time errors: the buggy compiler wrongly rejects semantically correct programs, (2) the majority of typing-related bugs lie in the implementations of the underlying type systems and in other core components related to operations on types, (3) parametric polymorphism is the most pervasive feature in the corresponding test cases, (4) one third of typing-related bugs are triggered by non-compilable programs.
We believe that our study opens up a new research direction by driving future researchers to build appropriate methods and techniques for a more holistic testing of compilers.

Supplementary Material

Auxiliary Presentation Video (oopsla21main-p117-p-video.mp4)
This is a presentation video for the OOPSLA 2021 paper titled “Well-Typed Programs Can Go Wrong: A Study of Typing-Related Bugs in JVM Compilers,” accepted in the research track.

References

[1]
Nada Amin, Samuel Grütter, Martin Odersky, Tiark Rompf, and Sandro Stucki. 2016. The Essence of Dependent Object Types. Springer International Publishing, Cham. 249–272. https://doi.org/10.1007/978-3-319-30936-1_14
[2]
Mehdi Bagherzadeh, Nicholas Fireman, Anas Shawesh, and Raffi Khatchadourian. 2020. Actor Concurrency Bugs: A Comprehensive Study on Symptoms, Root Causes, API Usages, and Differences. Proc. ACM Program. Lang., 4, OOPSLA (2020), Article 214, Nov., 32 pages. https://doi.org/10.1145/3428282
[3]
Gilad Bracha, Martin Odersky, David Stoutamire, and Philip Wadler. 1998. Making the Future Safe for the Past: Adding Genericity to the Java Programming Language. In Proceedings of the 13th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA ’98). Association for Computing Machinery, New York, NY, USA. 183–200. https://doi.org/10.1145/286936.286957
[4]
Junjie Chen, Yanwei Bai, Dan Hao, Yingfei Xiong, Hongyu Zhang, and Bing Xie. 2017. Learning to Prioritize Test Programs for Compiler Testing. In Proceedings of the 39th International Conference on Software Engineering (ICSE ’17). IEEE Press, 700–711. https://doi.org/10.1109/ICSE.2017.70
[5]
Junjie Chen, Y. Bai, D. Hao, Y. Xiong, H. Zhang, L. Zhang, and B. Xie. 2016. Test Case Prioritization for Compilers: A Text-Vector Based Approach. In 2016 IEEE International Conference on Software Testing, Verification and Validation (ICST). 266–277. https://doi.org/10.1109/ICST.2016.19
[6]
Junjie Chen, Jibesh Patra, Michael Pradel, Yingfei Xiong, Hongyu Zhang, Dan Hao, and Lu Zhang. 2020. A Survey of Compiler Testing. ACM Comput. Surv., 53, 1 (2020), Article 4, Feb., 36 pages. issn:0360-0300 https://doi.org/10.1145/3363562
[7]
Yuting Chen, Ting Su, and Zhendong Su. 2019. Deep Differential Testing of JVM Implementations. In Proceedings of the 41st International Conference on Software Engineering (ICSE ’19). IEEE Press, 1257–1268. https://doi.org/10.1109/ICSE.2019.00127
[8]
Yuting Chen, Ting Su, Chengnian Sun, Zhendong Su, and Jianjun Zhao. 2016. Coverage-Directed Differential Testing of JVM Implementations. In Proceedings of the 37th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’16). Association for Computing Machinery, New York, NY, USA. 85–99. https://doi.org/10.1145/2908080.2908095
[9]
Shafiul Azam Chowdhury, Sohil Lal Shrestha, Taylor T. Johnson, and Christoph Csallner. 2020. SLEMI: Equivalence modulo Input (EMI) Based Mutation of CPS Models for Finding Compiler Bugs in Simulink. In Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering (ICSE ’20). Association for Computing Machinery, New York, NY, USA. 335–346. https://doi.org/10.1145/3377811.3380381
[10]
Kyle Dewey, Jared Roesch, and Ben Hardekopf. 2015. Fuzzing the Rust Typechecker Using CLP. In Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering (ASE ’15). IEEE Press, 482–493. https://doi.org/10.1109/ASE.2015.65
[11]
Anthony Di Franco, Hui Guo, and Cindy Rubio-González. 2017. A Comprehensive Study of Real-World Numerical Bug Characteristics. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2017). IEEE Press, 509–519. https://doi.org/10.1109/ASE.2017.8115662
[12]
Alastair F. Donaldson, Hugues Evrard, Andrei Lascu, and Paul Thomson. 2017. Automated Testing of Graphics Shader Compilers. Proc. ACM Program. Lang., 1, OOPSLA (2017), Article 93, Oct., 29 pages. https://doi.org/10.1145/3133917
[13]
Alastair F. Donaldson, Hugues Evrard, and Paul Thomson. 2020. Putting Randomized Compiler Testing into Production (Experience Report). In 34th European Conference on Object-Oriented Programming (ECOOP 2020), Robert Hirschfeld and Tobias Pape (Eds.) (Leibniz International Proceedings in Informatics (LIPIcs), Vol. 166). Schloss Dagstuhl–Leibniz-Zentrum für Informatik, Dagstuhl, Germany. 22:1–22:29. issn:1868-8969 https://doi.org/10.4230/LIPIcs.ECOOP.2020.22
[14]
Saikat Dutta, Owolabi Legunsen, Zixin Huang, and Sasa Misailovic. 2018. Testing Probabilistic Programming Systems. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018). Association for Computing Machinery, New York, NY, USA. 574–586. https://doi.org/10.1145/3236024.3236057
[15]
Github Inc. 2021. The state of the Octoverse. https://octoverse.github.com/ Online accessed; 05-03-2021
[16]
James Gosling, Bill Joy, Guy Steele, Gilad Bracha, and Alex Buckley. 2015. The Java Language Specification: Java SE 8 Edition. https://docs.oracle.com/javase/specs/jls/se8/jls8.pdf
[17]
Christian Holler, Kim Herzig, and Andreas Zeller. 2012. Fuzzing with Code Fragments. In Proceedings of the 21st USENIX Conference on Security Symposium (Security’12). USENIX Association, USA. 38.
[18]
Guoliang Jin, Linhai Song, Xiaoming Shi, Joel Scherpelz, and Shan Lu. 2012. Understanding and Detecting Real-World Performance Bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’12). Association for Computing Machinery, New York, NY, USA. 77–88. https://doi.org/10.1145/2254064.2254075
[19]
Filip Křikava, Heather Miller, and Jan Vitek. 2019. Scala Implicits Are Everywhere: A Large-Scale Study of the Use of Scala Implicits in the Wild. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 163, Oct., 28 pages. https://doi.org/10.1145/3360589
[20]
Vu Le, Mehrdad Afshari, and Zhendong Su. 2014. Compiler Validation via Equivalence modulo Inputs. In Proceedings of the 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’14). Association for Computing Machinery, New York, NY, USA. 216–226. https://doi.org/10.1145/2594291.2594334
[21]
Vu Le, Chengnian Sun, and Zhendong Su. 2015. Finding Deep Compiler Bugs via Guided Stochastic Program Mutation. In Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA 2015). Association for Computing Machinery, New York, NY, USA. 386–399. https://doi.org/10.1145/2814270.2814319
[22]
Tanakorn Leesatapornwongsa, Jeffrey F. Lukman, Shan Lu, and Haryadi S. Gunawi. 2016. TaxDC: A Taxonomy of Non-Deterministic Concurrency Bugs in Datacenter Distributed Systems. In Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS ’16). Association for Computing Machinery, New York, NY, USA. 517–530. https://doi.org/10.1145/2872362.2872374
[23]
Christopher Lidbury, Andrei Lascu, Nathan Chong, and Alastair F. Donaldson. 2015. Many-Core Compiler Fuzzing. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’15). Association for Computing Machinery, New York, NY, USA. 65–76. https://doi.org/10.1145/2737924.2737986
[24]
Vsevolod Livinskii, Dmitry Babokin, and John Regehr. 2020. Random Testing for C and C++ Compilers with YARPGen. Proc. ACM Program. Lang., 4, OOPSLA (2020), Article 196, Nov., 25 pages. https://doi.org/10.1145/3428264
[25]
M. Zalewski. 2013. American fuzzy lop. https://lcamtuf.coredump.cx/afl/ Online accessed; 05-08-2021
[26]
Michaël Marcozzi, Qiyi Tang, Alastair F. Donaldson, and Cristian Cadar. 2019. Compiler Fuzzing: How Much Does It Matter? Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 155, Oct., 29 pages. https://doi.org/10.1145/3360581
[27]
Luis Mastrangelo, Matthias Hauswirth, and Nathaniel Nystrom. 2019. Casting about in the Dark: An Empirical Study of Cast Operations in Java Programs. Proc. ACM Program. Lang., 3, OOPSLA (2019), Article 158, Oct., 31 pages. https://doi.org/10.1145/3360584
[28]
Bruno Gois Mateus and Matias Martinez. 2020. On the Adoption, Usage and Evolution of Kotlin Features in Android Development. In Proceedings of the 14th ACM / IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) (ESEM ’20). Association for Computing Machinery, New York, NY, USA. Article 15, 12 pages. https://doi.org/10.1145/3382494.3410676
[29]
Robin Milner. 1978. A Theory of Type Polymorphism in Programming. J. Comput. System Sci., 17, 3 (1978), 348–375.
[30]
Adriaan Moors, Frank Piessens, and Martin Odersky. 2008. Generics of a Higher Kind. In Proceedings of the 23rd ACM SIGPLAN Conference on Object-Oriented Programming Systems Languages and Applications (OOPSLA ’08). Association for Computing Machinery, New York, NY, USA. 423–438. https://doi.org/10.1145/1449764.1449798
[31]
Eriko Nagai, Hironobu Awazu, Nagisa Ishiura, and Naoya Takeda. 2012. Random testing of C compilers targeting arithmetic optimization. In Workshop on Synthesis And System Integration of Mixed Information Technologies (SASIMI 2012). 48–53.
[32]
Eriko Nagai, Atsushi Hashimoto, and Nagisa Ishiura. 2014. Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions. IPSJ Transactions on System LSI Design Methodology, 7 (2014), 91–100. https://doi.org/10.2197/ipsjtsldm.7.91
[33]
Martin Odersky, Philippe Altherr, Vincent Cremet, Burak Emir, Sebastian Maneth, Stéphane Micheloud, Nikolay Mihaylov, Michel Schinz, Erik Stenman, and Matthias Zenger. 2004. An overview of the Scala programming language.
[34]
S. Park, W. Xu, I. Yun, D. Jang, and T. Kim. 2020. Fuzzing JavaScript Engines with Aspect-preserving Mutation. In 2020 IEEE Symposium on Security and Privacy (SP). 1629–1642. https://doi.org/10.1109/SP40000.2020.00067
[35]
John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-Case Reduction for C Compiler Bugs. In Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’12). Association for Computing Machinery, New York, NY, USA. 335–346. https://doi.org/10.1145/2254064.2254104
[36]
Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding and Analyzing Compiler Warning Defects. In Proceedings of the 38th International Conference on Software Engineering (ICSE ’16). Association for Computing Machinery, New York, NY, USA. 203–213. https://doi.org/10.1145/2884781.2884879
[37]
Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding Compiler Bugs via Live Code Mutation. OOPSLA 2016. Association for Computing Machinery, New York, NY, USA. 849–863. https://doi.org/10.1145/2983990.2984038
[38]
Chengnian Sun, Vu Le, Qirun Zhang, and Zhendong Su. 2016. Toward Understanding Compiler Bugs in GCC and LLVM. In Proceedings of the 25th International Symposium on Software Testing and Analysis (ISSTA 2016). Association for Computing Machinery, New York, NY, USA. 294–305. https://doi.org/10.1145/2931037.2931074
[39]
TIOBE Software BV. 2021. TIOBE index. https://www.tiobe.com/tiobe-index/ Online accessed; 05-03-2021
[40]
Seth Tisue. 2017. Bye bye JIRA — Scala issues migrated to GitHub scala/bug. https://contributors.scala-lang.org/t/bye-bye-jira-scala-issues-migrated-to-github-scala-bug/715
[41]
Junjie Wang, Bihuan Chen, Lei Wei, and Yang Liu. 2019. Superion: Grammar-Aware Greybox Fuzzing. In Proceedings of the 41st International Conference on Software Engineering (ICSE ’19). IEEE Press, 724–735. https://doi.org/10.1109/ICSE.2019.00081
[42]
Jie Wang, Wensheng Dou, Yu Gao, Chushu Gao, Feng Qin, Kang Yin, and Jun Wei. 2017. A Comprehensive Study on Real World Concurrency Bugs in Node.js. In Proceedings of the 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE 2017). IEEE Press, 520–531. https://doi.org/10.1109/ASE.2017.8115663
[43]
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and Understanding Bugs in C Compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI ’11). Association for Computing Machinery, New York, NY, USA. 283–294. https://doi.org/10.1145/1993498.1993532
[44]
Qirun Zhang, Chengnian Sun, and Zhendong Su. 2017. Skeletal Program Enumeration for Rigorous Compiler Testing. In Proceedings of the 38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2017). Association for Computing Machinery, New York, NY, USA. 347–361. https://doi.org/10.1145/3062341.3062379
[45]
Zhide Zhou, Zhilei Ren, Guojun Gao, and He Jiang. 2021. An empirical study of optimization bugs in GCC and LLVM. Journal of Systems and Software, 174 (2021), 110884. issn:0164-1212 https://doi.org/10.1016/j.jss.2020.110884
[46]
David Zubrow. 2010. IEEE Standard Classification for Software Anomalies. IEEE Std 1044-2009 (Revision of IEEE Std 1044-1993), 1–23. https://doi.org/10.1109/IEEESTD.2010.5399061

Cited By

View all

Recommendations

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 5, Issue OOPSLA
October 2021
2001 pages
EISSN:2475-1421
DOI:10.1145/3492349
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 October 2021
Published in PACMPL Volume 5, Issue OOPSLA

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. Groovy
  2. Java
  3. Kotlin
  4. Scala
  5. compiler bugs
  6. compiler testing
  7. static typing

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)534
  • Downloads (Last 6 weeks)58
Reflects downloads up to 25 Jan 2025

Other Metrics

Citations

Cited By

View all

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media