DOI: 10.1145/3338906.3338957

Compiler bug isolation via effective witness test program generation

Published: 12 August 2019

Abstract

Compiler bugs are extremely harmful but notoriously difficult to debug, because they usually produce little debugging information. Given a bug-triggering test program for a compiler, hundreds of compiler files are usually involved during compilation and are therefore all suspect. Although many automated bug isolation techniques exist, they are not applicable to compilers because of scalability or effectiveness problems. To solve this problem, we transform compiler bug isolation into a search problem: searching for a set of effective witness test programs that can eliminate innocent compiler files from the suspects. Based on this intuition, we propose an automated compiler bug isolation technique, DiWi, which (1) uses a heuristic-based search strategy to generate such a set of effective witness test programs by applying our witnessing mutation rules to the given failing test program, and (2) compares their coverage to isolate bugs, following the practice of spectrum-based bug isolation. Experimental results on 90 real bugs from the popular GCC and LLVM compilers show that DiWi effectively isolates 66.67%/78.89% of the bugs within the Top-10/Top-20 compiler files, significantly outperforming state-of-the-art bug isolation techniques.
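To make step (2) concrete, the following is a minimal sketch of spectrum-based ranking over compiler files, given one failing test program and a set of passing witness programs. The abstract does not state which suspiciousness formula is used, so the Ochiai formula here is an assumption, and the function name `rank_suspicious_files` and the file names are hypothetical.

```python
from math import sqrt


def rank_suspicious_files(failing_cov, witness_covs):
    """Rank compiler files touched by the failing program.

    failing_cov:  set of files covered while compiling the failing program.
    witness_covs: list of sets, one per passing witness program.
    Assumption: Ochiai score ef / sqrt((ef + nf) * (ef + ep)) with a single
    failing test, so ef = 1 and nf = 0 for every file it covers.
    """
    scores = {}
    for path in failing_cov:
        # ep: number of passing witnesses that also cover this file
        ep = sum(1 for cov in witness_covs if path in cov)
        scores[path] = 1.0 / sqrt(1 * (1 + ep))
    # Highest score first; ties broken by file name for determinism
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Illustrative (made-up) coverage data
failing = {"pass_a.c", "pass_b.c", "pass_c.c"}
witnesses = [{"pass_a.c", "pass_b.c"}, {"pass_b.c"}]

for path, score in rank_suspicious_files(failing, witnesses):
    print(f"{path}: {score:.3f}")
# pass_c.c: 1.000
# pass_a.c: 0.707
# pass_b.c: 0.577
```

Each witness that passes while covering a file lowers that file's score, which is how witness programs "eliminate innocent compiler files from the suspects" in this sketch: `pass_c.c` is covered only by the failing program and therefore ranks first.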

ESEC/FSE ’19, August 26–30, 2019, Tallinn, Estonia
Junjie Chen, Jiaqi Han, Peiyi Sun, Lingming Zhang, Dan Hao, and Lu Zhang

Information

Published In

ESEC/FSE 2019: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering
August 2019
1264 pages
ISBN:9781450355728
DOI:10.1145/3338906
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 August 2019

Author Tags

  1. Bug Isolation
  2. Compiler Debugging
  3. Test Program Generation

Qualifiers

  • Research-article

Funding Sources

  • National Key Research and Development Program of China
  • National Natural Science Foundation of China
  • Amazon
  • NSF

Conference

ESEC/FSE '19

Acceptance Rates

Overall Acceptance Rate 112 of 543 submissions, 21%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 230
  • Downloads (Last 6 weeks): 28
Reflects downloads up to 01 Jan 2025
