research-article

Make out like a (Multi-Armed) Bandit: Improving the Odds of Fuzzer Seed Scheduling with T-Scheduler

Authors:

Adrian Herrera,

Damith C. Ranasinghe,

Salil S KanhereAuthors Info & Claims

ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security

Pages 1463 - 1479

https://doi.org/10.1145/3634737.3637639

Published: 01 July 2024 Publication History

Abstract

Fuzzing is an industry-standard software testing technique that uncovers bugs in a target program by executing it with mutated inputs. Over the lifecycle of a fuzzing campaign, the fuzzer accumulates inputs inducing new and interesting target behaviors, drawing from these inputs for further mutation and generation of new inputs. This rapidly results in a large pool of inputs to select from, making it challenging to quickly determine the "most promising" input for mutation. Reinforcement learning (RL) provides a natural solution to this seed scheduling problem---a fuzzer can dynamically adapt its selection strategy by learning from past results. However, existing RL approaches are (a) computationally expensive (reducing fuzzer throughput), and/or (b) require hyperparameter tuning (reducing generality across targets and input types). To this end, we propose T-Scheduler, a seed scheduler built upon multi-armed bandit theory to automatically adapt to the target. Notably, our formulation does not require the user to select or tune hyperparameters and is therefore easily generalizable across different targets. We evaluate T-Scheduler over 35 CPU-yr fuzzing effort, comparing it to 11 state-of-the-art schedulers. Our results show that T-Scheduler improves on these 11 schedulers on both bug-finding and coverage-expansion abilities.

References

[1]

Adobe. 1992. TIFF, Revision 6.0. https://www.loc.gov/preservation/digital/formats/fdd/fdd000022.shtml

[2]

Shipra Agrawal and Navin Goyal. 2017. Near-Optimal Regret Bounds for Thompson Sampling. Jorunal of the ACM 64, 5, Article 30 (2017), 24 pages.

Digital Library

[3]

Andrea Arcuri and Lionel Briand. 2011. A Practical Guide for Using Statistical Tests to Assess Randomized Algorithms in Software Engineering. In International Conference on Software Engineering (ICSE). ACM, 1--10.

Digital Library

[4]

Cornelius Aschermann, Sergej Schumilo, Tim Blazytko, Robert Gawlik, and Thorsten Holz. 2019. REDQUEEN: Fuzzing with Input-to-State Correspondence. In Network and Distributed System Security (NDSS). The Internet Society, 15 pages.

[5]

Marcel Böhme, Valentin J. M. Manès, and Sang Kil Cha. 2020. Boosting Fuzzer Efficiency: An Information Theoretic Perspective. In European Software Engineering Conference and Foundations of Software Engineering (ESEC/FSE). ACM, 678--689.

Digital Library

[6]

Marcel Böhme, Van-Thuan Pham, Manh-Dung Nguyen, and Abhik Roychoudhury. 2017. Directed Greybox Fuzzing. In Computer and Communications Security (CCS). ACM, 2329--2344.

Digital Library

[7]

Marcel Böhme, Van-Thuan Pham, and Abhik Roychoudhury. 2016. Coverage-Based Greybox Fuzzing as Markov Chain. In Computer and Communications Security (CCS). ACM, 1032--1043.

Digital Library

[8]

Marcel Böhme, László Szekeres, and Jonathan Metzman. 2022. On the Reliability of Coverage-Based Fuzzer Benchmarking. In International Conference on Software Engineering (ICSE). ACM, 1621--1633.

Digital Library

[9]

Konstantin Böttinger, Patrice Godefroid, and Rishabh Singh. 2018. Deep Reinforcement Fuzzing. In Security and Privacy Workshops (SPW). IEEE, 116--122.

[10]

Oliver Chang. 2023. Taking the next step: OSS-Fuzz in 2023. https://security.googleblog.com/2023/02/taking-next-step-oss-fuzz-in-2023.html

[11]

Yaohui Chen, Mansour Ahmadi, Reza Mirzazade farkhani, Boyu Wang, and Long Lu. 2020. MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing. In Research in Attacks, Intrusions and Defenses (RAID). USENIX, 77--92.

[12]

Liang Cheng, Yang Zhang, Yi Zhang, Chen Wu, Zhangtan Li, Yu Fu, and Haisheng Li. 2019. Optimizing Seed Inputs in Fuzzing with Machine Learning. In International Conference on Software Engineering: Companion (ICSE). IEEE, 244--245.

Digital Library

[13]

Clang Team. 2022. Source-based Code Coverage. https://clang.llvm.org/docs/SourceBasedCodeCoverage.html

[14]

Jacob Cohen. 1960. A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement 20, 1 (1960), 37--46.

[15]

Andrea Fioraldi, Dominik Maier, Heiko Eißfeldt, and Marc Heuse. 2020. AFL++: Combining Incremental Steps of Fuzzing Research. In Workshop on Offensive Technologies (WOOT). USENIX, 12 pages.

[16]

Patrice Godefroid, Hila Peleg, and Rishabh Singh. 2017. Learn&Fuzz: Machine Learning for Input Fuzzing. In Automated Software Engineering (ASE). IEEE, 50--59.

[17]

Ahmad Hazimeh, Adrian Herrera, and Mathias Payer. 2020. Magma: A Ground-Truth Fuzzing Benchmark. Measurement and Analysis of Computing Systems 4, 3, Article 49 (2020), 29 pages.

Digital Library

[18]

Adrian Herrera, Hendra Gunadi, Shane Magrath, Michael Norrish, Mathias Payer, and Antony L. Hosking. 2021. Seed Selection for Successful Fuzzing. In International Symposium on Software Testing and Analysis (ISSTA). ACM, 230--243.

Digital Library

[19]

Adrian Herrera, Mathias Payer, and Antony L. Hosking. 2022. Registered Report: datAFLow Towards a Data-Flow-Guided Fuzzer. In Fuzzing Workshop (FUZZING). The Internet Society, 11 pages.

[20]

Siddharth Karamcheti, Gideon Mann, and David Rosenberg. 2018. Adaptive Grey-Box Fuzz-Testing with Thompson Sampling. In Artificial Intelligence and Security (AISec). ACM, 37--47.

Digital Library

[21]

Leo Katz. 1953. A new status index derived from sociometric analysis. Psychometrika (1953), 39--43.

[22]

George Klees, Andrew Ruef, Benji Cooper, Shiyi Wei, and Michael Hicks. 2018. Evaluating Fuzz Testing. In Computer and Communications Security (CCS). ACM, 2123--2138.

Digital Library

[23]

Yuki Koike, Hiroyuki Katsura, Hiromu Yakura, and Yuma Kurogome. 2022. SLOPT: Bandit Optimization Framework for Mutation-Based Fuzzing. In Proceedings of the 38th Annual Computer Security Applications Conference (ACSAC). ACM, 519--533.

Digital Library

[24]

Tor Lattimore and Csaba Szepesvári. 2020. Bandit algorithms. Cambridge University Press.

[25]

Siqi Li, Xiaofei Xie, Yun Lin, Yuekang Li, Ruitao Feng, Xiaohong Li, Weimin Ge, and Jin Song Dong. 2022. Deep Learning for Coverage-Guided Fuzzing: How Far are We? Transactions on Dependable and Secure Computing (2022), 1--13.

[26]

Nathan Mantel. 1966. Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemotherapy Reports 50, 3 (1966), 163--170.

[27]

Valentin J.M. Manès, HyungSeok Han, Choongwoo Han, Sang Kil Cha, Manuel Egele, Edward J. Schwartz, and Maverick Woo. 2021. The Art, Science, and Engineering of Fuzzing: A Survey. Transactions on Software Engineering 47, 11 (2021), 2312--2331.

[28]

Jonathan Metzman, László Szekeres, Laurent Maurice Romain Simon, Read Trevelin Sprabery, and Abhishek Arya. 2021. FuzzBench: An Open Fuzzer Benchmarking Platform and Service. In European Software Engineering Conference and Foundations of Software Engineering (ESEC/FSE). ACM, 1393--1403.

Digital Library

[29]

NIST. 2019. CVE-2019-7663. https://nvd.nist.gov/vuln/detail/CVE-2019-7663

[30]

Alexandre Rebert, Sang Kil Cha, Thanassis Avgerinos, Jonathan Foote, David Warren, Gustavo Grieco, and David Brumley. 2014. Optimizing Seed Selection for Fuzzing. In USENIX Security (SEC). USENIX, 861--875.

Digital Library

[31]

Gary J. Saavedra, Kathryn N. Rodhouse, Daniel M. Dunlavy, and W. Philip Kegelmeyer. 2019. A Review of Machine Learning Applications in Fuzzing. arXiv Preprint abs/1906.11133 (2019), 12 pages.

[32]

Joseph Scott, Federico Mora, and Vijay Ganesh. 2020. BanditFuzz: A Reinforcement-Learning Based Performance Fuzzer for SMT Solvers. In Verified Software: Theories, Tools, and Experiments (VSTTE). Springer-Verlag, 68--86.

Digital Library

[33]

Yevgeny Seldin, Csaba Szepesvári, Peter Auer, and Yasin Abbasi-Yadkori. 2013. Evaluation and Analysis of the Performance of the EXP3 Algorithm in Stochastic Environments. In Proceedings of Machine Learning Research (PMLR, Vol. 24). PMLR, 103--116. http://proceedings.mlr.press/v24/seldin12a.html

[34]

Kostya Serebryany. 2017. OSS-Fuzz - Google's continuous fuzzing service for open source software. In USENIX Security (SEC). USENIX.

[35]

Dongdong She, Abhishek Shah, and Suman Jana. 2022. Effective Seed Scheduling for Fuzzing with Graph Centrality Analysis. In Security and Privacy (S&P). IEEE, 2194--2211.

[36]

Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An introduction. MIT press.

Digital Library

[37]

William R Thompson. 1933. On the likelihood that one unknown probability exceeds another in view of the evidence of two samples. Biometrika 25, 3--4 (1933), 285--294.

[38]

Jonas Benedict Wagner. 2017. Elastic Program Transformations: Automatically Optimizing the Reliability/Performance Trade-off in Systems Software. Ph. D. Dissertation. EPFL.

[39]

Daimeng Wang, Zheng Zhang, Hang Zhang, Zhiyun Qian, Srikanth V. Krishnamurthy, and Nael Abu-Ghazaleh. 2021. SyzVegas: Beating Kernel Fuzzing Odds with Reinforcement Learning. In USENIX Security (SEC). USENIX, 2741--2758.

[40]

Jinghan Wang, Chengyu Song, and Heng Yin. 2021. Reinforcement Learning-based Hierarchical Seed Scheduling for Greybox Fuzzing. In Network and Distributed System Security Symposium (NDSS). The Internet Society, 17 pages.

[41]

Pengfei Wang and Xu Zhou. 2020. SoK: The Progress, Challenges, and Perspectives of Directed Greybox Fuzzing. CoRR abs/2005.11907 (2020).

[42]

Yan Wang, Peng Jia, Luping Liu, Cheng Huang, and Zhonglin Liu. 2020. A systematic review of fuzzing based on machine learning techniques. PLOS ONE 15, 8 (2020), 1--37.

[43]

Yanhao Wang, Xiangkun Jia, Yuwei Liu, Kyle Zeng, Tiffany Bao, Dinghao Wu, and Purui Su. 2020. Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization. In Network and Distributed Systems Security (NDSS). The Internet Society, 17 pages.

[44]

Maverick Woo, Sang Kil Cha, Samantha Gottlieb, and David Brumley. 2013. Scheduling Black-Box Mutational Fuzzing. In Computer and Communications Security (CCS). ACM, 511--522.

Digital Library

[45]

Wen Xu, Sanidhya Kashyap, Changwoo Min, and Taesoo Kim. 2017. Designing New Operating Primitives to Improve Fuzzing Performance. In Computer and Communications Security (CCS). ACM, 2313--2328.

Digital Library

[46]

Tai Yue, Pengfei Wang, Yong Tang, Enze Wang, Bo Yu, Kai Lu, and Xu Zhou. 2020. EcoFuzz: Adaptive Energy-Saving Greybox Fuzzing as a Variant of the Adversarial Multi-Armed Bandit. In USENIX Security (SEC). USENIX, 2307--2324.

[47]

Michał Zalewski. 2015. American Fuzzy Lop (AFL). http://lcamtuf.coredump.cx/afl/

[48]

Gen Zhang, Pengfei Wang, Tai Yue, Xiangdong Kong, Shan Huang, Xu Zhou, and Kai Lu. 2022. MobFuzz: Adaptive Multi-objective Optimization in Gray-box Fuzzing. In Network and Distributed Systems Security (NDSS). The Internet Society, 18 pages.

[49]

Han Zheng, Jiayuan Zhang, Yuhang Huang, Zezhong Ren, He Wang, Chunjie Cao, Yuqing Zhang, Flavio Toffalini, and Mathias Payer. 2022. FishFuzz: Throwing Larger Nets to Catch Deeper Bugs. CoRR abs/2207.13393 (2022).

[50]

Peiyuan Zong, Tao Lv, Dawei Wang, Zizhuang Deng, Ruigang Liang, and Kai Chen. 2020. FuzzGuard: Filtering out Unreachable Inputs in Directed Grey-box Fuzzing through Deep Learning. In USENIX Security (SEC). USENIX, 2255--2269.

Cited By

Qian RZhang QFang CYang DLi SLi BChen Z(2024)DiPri: Distance-based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/3654440Online publication date: 26-Mar-2024
https://dl.acm.org/doi/10.1145/3654440
Fan ZYe GHu TTang Z(2024)History-driven Compiler Fuzzing via Assembling and Scheduling Bug-triggering Code Segments2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE)10.1109/ISSRE62328.2024.00040(331-342)Online publication date: 28-Oct-2024
https://doi.org/10.1109/ISSRE62328.2024.00040

Index Terms

Make out like a (Multi-Armed) Bandit: Improving the Odds of Fuzzer Seed Scheduling with T-Scheduler
1. Computing methodologies
  1. Machine learning
2. Security and privacy
  1. Software and application security

Recommendations

AutoFuzz: automatic fuzzer-sanitizer scheduling with multi-armed bandit
Abstract
Coverage-guided fuzzing (CGF) is a widely used technique for exposing vulnerabilities in software. Despite its success, selecting the ideal CGF fuzzer or fuzzer combination for a particular program continues to pose a challenge, given that no ...
Grey-box concolic testing on binary code
ICSE '19: Proceedings of the 41st International Conference on Software Engineering

We present grey-box concolic testing, a novel path-based test case generation method that combines the best of both white-box and grey-box fuzzing. At a high level, our technique systematically explores execution paths of a program under test as in white-...
Boosting fuzzer efficiency: an information theoretic perspective
ESEC/FSE 2020: Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

In this paper, we take the fundamental perspective of fuzzing as a learning process. Suppose before fuzzing, we know nothing about the behaviors of a program P: What does it do? Executing the first test input, we learn how P behaves for this input. ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security

July 2024

1987 pages

ISBN:9798400704826

DOI:10.1145/3634737

Chair:
Jianying Zhou,
Co-chair:
Tony Q. S. Quek,
Program Chairs:
Debin Gao,
Alvaro Cardenas
University of California, Santa Cruz, USA

Copyright © 2024 Copyright is held by the owner/author(s). Publication rights licensed to ACM.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Sponsors

SIGSAC: ACM Special Interest Group on Security, Audit, and Control

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 July 2024

Check for updates

Author Tags

Qualifiers

Research-article

Conference

ASIA CCS '24

Sponsor:

SIGSAC

ASIA CCS '24: 19th ACM Asia Conference on Computer and Communications Security

July 1 - 5, 2024

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 418 of 2,322 submissions, 18%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

2
Total Citations
View Citations
104
Total Downloads

Downloads (Last 12 months)104
Downloads (Last 6 weeks)17

Reflects downloads up to 06 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Qian RZhang QFang CYang DLi SLi BChen Z(2024)DiPri: Distance-based Seed Prioritization for Greybox FuzzingACM Transactions on Software Engineering and Methodology10.1145/3654440Online publication date: 26-Mar-2024
https://dl.acm.org/doi/10.1145/3654440
Fan ZYe GHu TTang Z(2024)History-driven Compiler Fuzzing via Assembling and Scheduling Bug-triggering Code Segments2024 IEEE 35th International Symposium on Software Reliability Engineering (ISSRE)10.1109/ISSRE62328.2024.00040(331-342)Online publication date: 28-Oct-2024
https://doi.org/10.1109/ISSRE62328.2024.00040

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten