research-article
Open access

Understanding and Finding Java Decompiler Bugs

Published: 29 April 2024

Abstract

Java decompilers are programs that perform the reverse process of Java compilers, i.e., they translate Java bytecode to Java source code. They are essential for reverse engineering purposes and have become more sophisticated and reliable over the years. However, it remains challenging for modern Java decompilers to reliably perform correct decompilation on real-world programs. To shed light on the key challenges of Java decompilation, this paper provides the first systematic study on the characteristics and causes of bugs in mature, widely used Java decompilers. We conduct the study by investigating 333 unique bugs from three popular Java decompilers. Our key findings and observations include: (1) Although most of the reported bugs were found when decompiling large, real-world code, 40.2% of them have small test cases for bug reproduction; (2) Over 80% of the bugs manifest as exceptions, syntactic errors, or semantic errors, and bugs with source code artifacts are very likely semantic errors; (3) 57.7%, 39.0%, and 41.1% of the bugs are attributed, respectively, to three stages of decompilers: loading structure entities from bytecode, optimizing these entities, and generating source code from these entities; (4) Bugs in decompilers’ type inference are the most complex to fix; and (5) Region restoration for structures such as loops, sugaring for special structures such as switch, and type inference of variables of generic types or indistinguishable types are the three most significant challenges in Java decompilation, which to some extent explains our findings in (3) and (4).
Based on these findings, we present JD-Tester, a differential testing framework for Java decompilers, and our experience of using it in testing the three popular Java decompilers. JD-Tester utilizes different Java program generators to construct executable Java tests and finds exceptions, syntactic inconsistencies, and semantic inconsistencies (i.e., bugs) between a generated test and its compiled-decompiled version (through compilation and execution). In total, we have found 62 bugs in the three decompilers, demonstrating both the effectiveness of JD-Tester and the importance of testing and validating Java decompilers.
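The differential check described above can be illustrated with a minimal sketch. It is not JD-Tester's actual implementation: the class `DecompilerOracle`, the `Verdict` enum, and the `classify` method are hypothetical names, and the sketch assumes that the compile, decompile, recompile, and run steps have already been performed elsewhere and that only their outcomes are passed in. It also assumes that a syntactic inconsistency shows up as a recompilation failure and a semantic inconsistency as differing program output.

```java
import java.util.Objects;

public class DecompilerOracle {
    /** Outcome of one compile-decompile-recompile-run round trip (illustrative names). */
    public enum Verdict { PASS, EXCEPTION, SYNTACTIC_ERROR, SEMANTIC_ERROR }

    /**
     * Classifies a round trip from its observed outcomes.
     *
     * @param decompilerThrew  the decompiler crashed or reported an exception
     * @param recompiles       the decompiled source compiled successfully
     * @param originalOutput   output of running the generated test program
     * @param recompiledOutput output of running the recompiled, decompiled program
     */
    public static Verdict classify(boolean decompilerThrew, boolean recompiles,
                                   String originalOutput, String recompiledOutput) {
        if (decompilerThrew) {
            return Verdict.EXCEPTION;        // no usable source was produced
        }
        if (!recompiles) {
            return Verdict.SYNTACTIC_ERROR;  // decompiled source is not valid Java
        }
        if (!Objects.equals(originalOutput, recompiledOutput)) {
            return Verdict.SEMANTIC_ERROR;   // valid Java, but behaviour changed
        }
        return Verdict.PASS;
    }

    public static void main(String[] args) {
        System.out.println(classify(false, true, "42\n", "42\n"));  // PASS
        System.out.println(classify(true, false, "42\n", ""));      // EXCEPTION
        System.out.println(classify(false, false, "42\n", ""));     // SYNTACTIC_ERROR
        System.out.println(classify(false, true, "42\n", "41\n"));  // SEMANTIC_ERROR
    }
}
```

The checks are ordered from the earliest failure to the latest, so each test is reported under the first stage at which the round trip breaks. Comparing only printed output is a simplification; a fuller oracle might also compare exit codes or thrown exceptions.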

Cited By

  • (2024) Rust-twins: Automatic Rust Compiler Testing through Program Mutation and Dual Macros Generation. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, pp. 631-642. DOI: 10.1145/3691620.3695059. Online publication date: 27-Oct-2024
  • (2024) An Introduction of Test Code Approach in Basic Java Programming Course. 2024 Seventh International Conference on Vocational Education and Electrical Engineering (ICVEE), pp. 348-352. DOI: 10.1109/ICVEE63912.2024.10824018. Online publication date: 30-Oct-2024

Published In

Proceedings of the ACM on Programming Languages, Volume 8, Issue OOPSLA1
April 2024
1492 pages
EISSN:2475-1421
DOI:10.1145/3554316
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 29 April 2024
Published in PACMPL Volume 8, Issue OOPSLA1

Author Tags

  1. Decompiler
  2. Differential Testing
  3. Reverse Engineering

Qualifiers

  • Research-article

Article Metrics

  • Downloads (Last 12 months): 800
  • Downloads (Last 6 weeks): 74
Reflects downloads up to 16 Jan 2025
