research-article

Open access

Understanding and Finding Java Decompiler Bugs

Authors:

Zhendong SuAuthors Info & Claims

Proceedings of the ACM on Programming Languages, Volume 8, Issue OOPSLA1

Article No.: 143, Pages 1380 - 1406

https://doi.org/10.1145/3649860

Published: 29 April 2024 Publication History

PDF eReader

Abstract

Java decompilers are programs that perform the reverse process of Java compilers, i.e., they translate Java bytecode to Java source code. They are essential for reverse engineering purposes and have become more sophisticated and reliable over the years. However, it remains challenging for modern Java decompilers to reliably perform correct decompilation on real-world programs. To shed light on the key challenges of Java decompilation, this paper provides the first systematic study on the characteristics and causes of bugs in mature, widely-used Java decompilers. We conduct the study by investigating 333 unique bugs from three popular Java decompilers. Our key findings and observations include: (1) Although most of the reported bugs were found when decompiling large, real-world code, 40.2% of them have small test cases for bug reproduction; (2) Over 80% of the bugs manifest as exceptions, syntactic errors, or semantic errors, and bugs with source code artifacts are very likely semantic errors; (3) 57.7%, 39.0%, and 41.1% of the bugs respectively are attributed to three stages of decompilers—loading structure entities from bytecode, optimizing these entities, and generating source code from these entities; (4) Bugs in decompilers’ type inference are the most complex to fix; and (5) Region restoration for structures like loop, sugaring for special structures like switch, and type inference of variables of generic types or indistinguishable types are the three most significant challenges in Java decompilation, which to some extent explains our findings in (3) and (4).

Based on these findings, we present JD-Tester, a differential testing framework for Java decompilers, and our experience of using it in testing the three popular Java decompilers. JD-Testerutilizes different Java program generators to construct executable Java tests and finds exceptions, syntactic, and semantic inconsistencies (i.e. bugs) between a generated test and its compiled-decompiled version (through compilation and execution). In total, we have found 62 bugs in the three decompilers, demonstrating both the effectiveness of JD-Tester, and the importance of testing and validating Java decompilers.

References

[1]

Prerna Agrawal and Bhushan Trivedi. 2020. Unstructured data collection from APK files for malware detection. International Journal of Computer Applications, 975 (2020), 8887.

Abstract

References

Cited By

Index Terms

Recommendations

How far we have come: testing decompilation correctness of C decompilers

Finding bugs in java native interface programs

Using a Decompiler for Real-World Source Recovery

Comments

Information

Published In

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Contributors

Other Metrics

Bibliometrics

Article Metrics

Other Metrics

Citations

Cited By

View options

PDF

eReader

Login options

Full Access

Figures

Other

Share

Share this Publication link

Share on social media

Affiliations