research-article

The plastic surgery hypothesis

Authors:

Premkumar Devanbu,

Federica Sarro

FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

Pages 306 - 317

https://doi.org/10.1145/2635868.2635898

Published: 11 November 2014

Abstract

Recent work on genetic-programming-based approaches to automatic program patching has relied on the insight that the content of new code can often be assembled out of fragments of code that already exist in the code base. This insight has been dubbed the plastic surgery hypothesis; successful, well-known automatic repair tools such as GenProg rest on this hypothesis, but it has never been validated. We formalize and validate the plastic surgery hypothesis and empirically measure the extent to which raw material for changes actually already exists in projects. In this paper, we mount a large-scale study of several large Java projects and examine a history of 15,723 commits to determine the extent to which these commits are graftable, i.e., can be reconstituted from existing code. We find an encouraging degree of graftability, surprisingly independent of commit size and type of commit. For example, we find that changes are 43% graftable from the exact version of the software being changed. With a view to investigating the difficulty of finding these grafts, we study the abundance of such grafts in three possible sources: the immediately previous version, prior history, and other projects. We also examine the contiguity or chunking of these grafts, and the degree to which grafts can be found in the same file. Our results are quite promising and suggest an optimistic future for automatic program patching methods that search for raw material in already extant code in the project being patched.
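
To make the notion of graftability concrete, the following is a minimal sketch that scores a commit by the fraction of its added lines that already occur somewhere in a donor code base (here, the version being changed). It is an illustrative line-level approximation under stated assumptions, not the measurement procedure used in the paper: the function names, the whitespace normalization, and the sample data are all hypothetical.

```python
def normalize(line: str) -> str:
    """Collapse runs of whitespace so formatting differences do not block a match."""
    return " ".join(line.split())


def graftability(added_lines, donor_lines):
    """Fraction of non-blank added lines that also occur in the donor code.

    Assumption: a line counts as graftable if an identical line (after
    whitespace normalization) exists anywhere in the donor code.
    """
    donor = {normalize(line) for line in donor_lines if line.strip()}
    added = [normalize(line) for line in added_lines if line.strip()]
    if not added:
        return 0.0  # an empty commit has nothing to graft
    found = sum(1 for line in added if line in donor)
    return found / len(added)


# Hypothetical donor: the version of the software being changed.
parent_version = [
    "int total = 0;",
    "for (int i = 0; i < n; i++) {",
    "    total += values[i];",
    "}",
    "return total;",
]

# Hypothetical commit: the lines it adds.
commit_added = [
    "for (int i = 0; i < n; i++) {",
    "    total   +=   values[i];",
    "    log.debug(total);",
    "}",
]

score = graftability(commit_added, parent_version)
print(f"{score:.2f}")  # 3 of the 4 added lines exist in the donor: 0.75
```

Swapping `parent_version` for lines drawn from prior history or from other projects would give the analogous score for the other two graft sources named in the abstract.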

Index Terms

The plastic surgery hypothesis
1. Social and professional topics
  1. Professional topics
    1. Management of computing and information systems
      1. Software management
        Software maintenance
2. Software and its engineering
  1. Software creation and management
    1. Software development techniques
      1. Reusability
    2. Software post-development issues
  2. Software notations and tools
    1. Software libraries and repositories

Published In

FSE 2014: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering

November 2014

856 pages

ISBN: 9781450330565

DOI: 10.1145/2635868

General Chair: Shing-Chi Cheung (Hong Kong University of Science and Technology, China)

Program Chairs: Alessandro Orso (Georgia Institute of Technology, USA), Margaret-Anne Storey (University of Victoria, Canada)

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGSOFT/FSE'14

Sponsor:

SIGSOFT

SIGSOFT/FSE'14: 22nd ACM SIGSOFT Symposium on the Foundations of Software Engineering

November 16 - 21, 2014

Hong Kong, China

Acceptance Rates

Overall Acceptance Rate 17 of 128 submissions, 13%
