DOI: 10.1145/2635868.2635898
research-article

The plastic surgery hypothesis

Published: 11 November 2014

Abstract

Recent work on genetic-programming-based approaches to automatic program patching has relied on the insight that the content of new code can often be assembled out of fragments of code that already exist in the code base. This insight has been dubbed the plastic surgery hypothesis; successful, well-known automatic repair tools such as GenProg rest on this hypothesis, but it has never been validated. We formalize and validate the plastic surgery hypothesis and empirically measure the extent to which raw material for changes already exists in projects. We mount a large-scale study of several large Java projects and examine a history of 15,723 commits to determine the extent to which these commits are graftable, i.e., can be reconstituted from existing code. We find an encouraging degree of graftability, surprisingly independent of commit size and type of commit. For example, we find that changes are 43% graftable from the exact version of the software being changed. To investigate the difficulty of finding these grafts, we study their abundance in three possible sources: the immediately previous version, prior history, and other projects. We also examine the contiguity, or chunking, of these grafts and the degree to which grafts can be found in the same file. Our results are promising and suggest an optimistic future for automatic program patching methods that search for raw material in extant code in the project being patched.
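To make the notion of graftability concrete, the following is a minimal sketch rather than the measurement procedure used in the study. It assumes line-level granularity and whitespace-insensitive matching, and the names `normalize`, `graftability`, `parent_version`, and `commit_added` are hypothetical, introduced only for illustration. Under those assumptions, the graftability of a commit is the fraction of its added lines that already occur in a donor body of code, here the immediately previous version.

```python
from typing import Iterable


def normalize(line: str) -> str:
    """Remove all whitespace so formatting differences do not block a match."""
    return "".join(line.split())


def graftability(added_lines: Iterable[str], donor_lines: Iterable[str]) -> float:
    """Fraction of non-blank added lines that also occur in the donor code.

    Assumption: a line counts as graftable if an identical line (after
    whitespace removal) exists anywhere in the donor.
    """
    donor = {normalize(line) for line in donor_lines} - {""}
    added = [n for n in map(normalize, added_lines) if n]
    if not added:
        return 0.0
    return sum(1 for n in added if n in donor) / len(added)


# Hypothetical donor: lines of the version being changed.
parent_version = [
    "int total = 0;",
    "for (Item item : items) {",
    "    total += item.price();",
    "}",
    "return total;",
]

# Hypothetical commit: the lines it adds.
commit_added = [
    "for (Item item : items) {",
    "    total += item.price();",
    "    log.debug(total);",
    "}",
]

score = graftability(commit_added, parent_version)
print(f"{score:.2f}")  # 3 of the 4 added lines exist in the donor: 0.75
```

Swapping the donor for lines drawn from prior history or from other projects would give the analogous figure for the other two sources named in the abstract.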


Published In

FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering
November 2014
856 pages
ISBN: 9781450330565
DOI: 10.1145/2635868
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Software graftability
  2. automated program repair
  3. code reuse
  4. empirical software engineering
  5. mining software repositories

Conference

SIGSOFT/FSE'14

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%
