DOI: 10.5555/2486788.2486805
Research article

Billions and billions of constraints: whitebox fuzz testing in production

Published: 18 May 2013

Abstract

We report on our experience with constraint-based whitebox fuzz testing in production across hundreds of large Windows applications, representing over 500 machine-years of computation from 2007 to 2013. Whitebox fuzzing leverages symbolic execution on binary traces and constraint solving to construct new inputs to a program; these inputs execute previously uncovered paths or trigger security vulnerabilities. Whitebox fuzzing found one-third of all file-fuzzing bugs during the development of Windows 7, saving millions of dollars by averting costly security vulnerabilities. The technique is in use today across multiple products at Microsoft. We describe key challenges in running whitebox fuzzing in production, give principles for addressing these challenges, and describe two new systems built from these principles: SAGAN, which collects data from every fuzzing run for further analysis, and JobCenter, which controls deployment of our whitebox fuzzing infrastructure across commodity virtual machines. Since June 2010, SAGAN has logged over 3.4 billion constraints solved, millions of symbolic executions, and tens of millions of test cases generated. Our work represents the largest-scale deployment of whitebox fuzzing to date, including the largest use ever of a Satisfiability Modulo Theories (SMT) solver. We present specific data analyses that improved our production use of whitebox fuzzing, and we report data on the performance of constraint solving and dynamic test generation that points toward future research problems.
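The core loop the abstract describes — record the branch constraints along one concrete execution, then negate them one at a time and ask an SMT solver for inputs that steer execution down the unexplored side — can be illustrated with a short sketch. This is a rough illustration only, not the paper's implementation: it uses the z3-solver Python bindings, and the input bytes and path constraints are hypothetical stand-ins for what a binary-level symbolic executor would actually record from a trace.

```python
# Minimal sketch of one whitebox-fuzzing iteration: negate each branch
# constraint observed on a concrete execution path and ask an SMT solver
# for an input that takes the other side of that branch.
# Assumptions: the z3-solver package; the four symbolic input bytes and
# the path constraints below are made up for illustration.
from z3 import BitVec, Solver, Not, sat

# Symbolic bytes of the input file (4 bytes for illustration).
b = [BitVec(f"b{i}", 8) for i in range(4)]

# Hypothetical path constraints from one concrete run: the trace took
# the "true" side of each of these branches.
path = [b[0] == 0x4D, b[1] == 0x5A, b[2] + b[3] > 10]

def new_inputs(path_constraints):
    """Yield one candidate input per negated branch."""
    for i, branch in enumerate(path_constraints):
        s = Solver()
        s.add(path_constraints[:i])  # keep the path prefix unchanged
        s.add(Not(branch))           # flip branch i
        if s.check() == sat:
            m = s.model()
            # model_completion fills in unconstrained bytes (with 0).
            yield bytes(m.eval(v, model_completion=True).as_long() for v in b)

for inp in new_inputs(path):
    print(inp.hex())  # each candidate would be run against the target program
```

Flipping each branch of the same trace in turn is the "generational search" idea from the authors' earlier SAGE work: one expensive symbolic execution is amortized over many candidate test cases, each of which is then run concretely to check coverage and detect crashes.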



Published In

ICSE '13: Proceedings of the 2013 International Conference on Software Engineering
IEEE Press, May 2013, 1561 pages
ISBN: 9781467330763
Overall acceptance rate: 276 of 1,856 submissions (15%)
