DOI: 10.1145/3576915.3623166
research-article
Open access

PyRTFuzz: Detecting Bugs in Python Runtimes via Two-Level Collaborative Fuzzing

Published: 21 November 2023

Abstract

Given the widespread use of Python and its sustaining impact, the security and reliability of the Python runtime system is highly and broadly critical. Yet with real-world bugs in Python runtimes being continuously and increasingly reported, technique/tool support for automated detection of such bugs is still largely lacking. In this paper, we present PyRTFuzz, a novel fuzzing technique/tool for holistically testing Python runtimes including the language interpreter and its runtime libraries. PyRTFuzz combines generation- and mutation-based fuzzing at the compiler- and application-testing level, respectively, as enabled by static/dynamic analysis for extracting runtime API descriptions, a declarative specification language for valid and diverse Python code generation, and a custom type-guided mutation strategy for format/structure-aware application input generation. We implemented PyRTFuzz for the primary Python implementation (CPython) and applied it to three versions of the runtime. Our experiments revealed 61 new, demonstrably exploitable bugs including those in the interpreter and most in the runtime libraries. Our results also demonstrated the promising scalability and cost-effectiveness of PyRTFuzz and its great potential for further bug discovery. The two-level collaborative fuzzing methodology instantiated in PyRTFuzz may also apply to other language runtimes especially those of interpreted languages.
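To make the two-level idea concrete, the following is a minimal sketch and not PyRTFuzz's implementation. It assumes a hand-written table of API descriptions (`API_SPECS`, hypothetical; the abstract says real descriptions are extracted by static/dynamic analysis), generates a tiny Python application around one runtime API (level 1), and then feeds that application inputs produced by a simple type-preserving mutator (level 2). All function names here are illustrative, and the sketch omits coverage feedback and the declarative specification language entirely.

```python
import random
import string

# Hypothetical API descriptions (module, function, parameter types).
# These are hand-written assumptions for illustration only.
API_SPECS = [
    {"module": "base64", "func": "b64encode", "params": ["bytes"]},
    {"module": "textwrap", "func": "shorten", "params": ["str", "int"]},
]


def generate_app(spec):
    """Level 1 (generation): emit source for a tiny app wrapping one runtime API."""
    args = ", ".join(f"a{i}" for i in range(len(spec["params"])))
    return (
        f"import {spec['module']}\n"
        f"def app({args}):\n"
        f"    return {spec['module']}.{spec['func']}({args})\n"
    )


def seed_value(type_name):
    """Return a fixed starting input for a declared parameter type."""
    return {"bytes": b"seed", "str": "seed text", "int": 8}[type_name]


def mutate(value, rng):
    """Level 2 (mutation): change a value while preserving its Python type."""
    if isinstance(value, bytes):
        if not value:
            return b"\x00"
        data = bytearray(value)
        # Flip one random bit in one random byte.
        data[rng.randrange(len(data))] ^= 1 << rng.randrange(8)
        return bytes(data)
    if isinstance(value, str):
        # Insert one random printable character at a random position.
        pos = rng.randrange(len(value) + 1)
        return value[:pos] + rng.choice(string.printable) + value[pos:]
    if isinstance(value, int):
        return value + rng.randint(-16, 16)
    raise TypeError(f"no mutator for {type(value).__name__}")


def fuzz(spec, iterations=50, seed=0):
    """Generate one app, then run it on a chain of mutated inputs."""
    rng = random.Random(seed)
    namespace = {}
    exec(compile(generate_app(spec), "<generated>", "exec"), namespace)
    app = namespace["app"]
    inputs = [seed_value(t) for t in spec["params"]]
    outcomes = {"ok": 0, "exception": 0}
    for _ in range(iterations):
        inputs = [mutate(v, rng) for v in inputs]
        try:
            app(*inputs)
            outcomes["ok"] += 1
        except Exception:
            # A Python-level exception is counted here; a real fuzzer would
            # look for crashes, hangs, or sanitizer reports in the runtime.
            outcomes["exception"] += 1
    return outcomes


if __name__ == "__main__":
    for spec in API_SPECS:
        print(spec["func"], fuzz(spec))
```

The split mirrors the abstract's description: generation decides which runtime code is reached, while mutation varies the data flowing into it. Keeping each mutated value's type is a deliberately crude stand-in for the type-guided strategy the paper describes.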





Published In

CCS '23: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
November 2023
3722 pages
ISBN:9798400700507
DOI:10.1145/3576915
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. code generation
  2. collaborative fuzzing
  3. fuzz testing
  4. greybox fuzzing
  5. language runtime
  6. python
  7. runtime system
  8. software security

Qualifiers

  • Research-article

Conference

CCS '23

Acceptance Rates

Overall Acceptance Rate 1,261 of 6,999 submissions, 18%


Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 1,140
  • Downloads (last 6 weeks): 135

Reflects downloads up to 20 Jan 2025.

Citations

Cited By

  • (2024) Towards Robust Detection of Open Source Software Supply Chain Poisoning Attacks in Industry Environments. Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 1990-2001. DOI: 10.1145/3691620.3695262. Online publication date: 27-Oct-2024
  • (2024) WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models. Proceedings of the ACM on Programming Languages 8:OOPSLA2, 709-735. DOI: 10.1145/3689736. Online publication date: 8-Oct-2024
  • (2024) Towards More Complete Constraints for Deep Learning Library Testing via Complementary Set Guided Refinement. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 1338-1350. DOI: 10.1145/3650212.3680364. Online publication date: 11-Sep-2024
  • (2024) Cross-Language Differential Testing of JSON Parsers. Proceedings of the 19th ACM Asia Conference on Computer and Communications Security, 1117-1127. DOI: 10.1145/3634737.3657003. Online publication date: 1-Jul-2024
  • (2024) VGX: Large-Scale Sample Generation for Boosting Learning-Based Software Vulnerability Analyses. Proceedings of the IEEE/ACM 46th International Conference on Software Engineering, 1-13. DOI: 10.1145/3597503.3639116. Online publication date: 20-May-2024
