Ankou: guiding grey-box fuzzing towards combinatorial difference

Authors: Valentin J. M. Manès, Sang Kil Cha

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

Pages 1024 - 1036

https://doi.org/10.1145/3377811.3380421

Published: 01 October 2020

Abstract

Grey-box fuzzing is an evolutionary process, which maintains and evolves a population of test cases with the help of a fitness function. Fitness functions used by current grey-box fuzzers are not informative in that they cannot distinguish different program executions as long as those executions achieve the same coverage. The problem is that current fitness functions only consider a union of data, but not their combination. As such, fuzzers often get stuck in a local optimum during their search. In this paper, we introduce Ankou, the first grey-box fuzzer that recognizes different combinations of execution information, and present several scalability challenges encountered while designing and implementing Ankou. Our experimental results show that Ankou is 1.94× and 8.0× more effective in finding bugs than AFL and Angora, respectively.
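The difference between a union of coverage data and its combination can be illustrated with a small sketch. This is not Ankou's fitness function, which the abstract does not specify; it is a toy model that assumes each execution is summarized as a set of branch identifiers. The names `Execution`, `union_novel`, `combination_novel`, and `runs` are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set

# Assumption: one execution is summarized as the set of branch IDs it covered.
Execution = FrozenSet[str]


def union_novel(executions: Iterable[Execution]) -> List[bool]:
    """Union-style fitness: a run is interesting only if it adds a branch
    that no earlier run has covered."""
    covered: Set[str] = set()
    flags = []
    for branches in executions:
        flags.append(not branches <= covered)  # any branch outside the union?
        covered |= branches
    return flags


def combination_novel(executions: Iterable[Execution]) -> List[bool]:
    """Combination-style fitness (toy version): a run is interesting if its
    exact combination of branches has not been observed before."""
    seen: Set[Execution] = set()
    flags = []
    for branches in executions:
        flags.append(branches not in seen)
        seen.add(branches)
    return flags


runs = [
    frozenset({"a", "b"}),
    frozenset({"c"}),
    frozenset({"a", "c"}),  # no new branch, but a new combination
    frozenset({"a", "b"}),  # exact repeat of the first run
]

print(union_novel(runs))        # [True, True, False, False]
print(combination_novel(runs))  # [True, True, True, False]
```

The third run covers no branch outside the accumulated union, so the union-style check discards it, while the combination-style check keeps it. Remembering every exact combination does not scale to real programs, which hints at the kind of scalability challenge the abstract mentions.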

Index Terms

Ankou: guiding grey-box fuzzing towards combinatorial difference
1. Security and privacy
  1. Software and application security
    1. Software security engineering
2. Software and its engineering
  1. Software creation and management
    1. Software verification and validation
      1. Software defect analysis
        Software testing and debugging

Published In

ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering

June 2020

1640 pages

ISBN:9781450371216

DOI:10.1145/3377811

General Chairs: Gregg Rothermel (North Carolina State University), Doo-Hwan Bae (KAIST, South Korea)

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

In-Cooperation

KIISE: Korean Institute of Information Scientists and Engineers
IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Korea government (MSIT)

Conference

ICSE '20

Sponsor:

SIGSOFT

ICSE '20: 42nd International Conference on Software Engineering

June 27 - July 19, 2020

Seoul, South Korea

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
