research-article
DOI: 10.1109/ICSE48619.2023.00129

Automated Program Repair in the Era of Large Pre-Trained Language Models

Published: 26 July 2023

Abstract

Automated Program Repair (APR) aims to help developers automatically patch software bugs. However, current state-of-the-art traditional and learning-based APR techniques suffer from limited patch variety and therefore fail to fix complicated bugs. This is mainly due to their reliance on bug-fixing datasets, either to craft fix templates (traditional) or to directly predict potential patches (learning-based). Large Pre-Trained Language Models (LLMs), trained on billions of text/code tokens, can potentially help avoid this issue. Very recently, researchers have directly leveraged LLMs for APR without relying on any bug-fixing datasets. However, such existing work either failed to include state-of-the-art LLMs or was not evaluated on realistic datasets. Thus, the true power of modern LLMs on the important APR problem is yet to be revealed.
In this work, we perform the first extensive study on directly applying LLMs for APR. We select 9 recent state-of-the-art LLMs, including both generative and infilling models, ranging from 125M to 20B parameters in size. We design 3 different repair settings to evaluate the different ways LLMs can be used to generate patches: 1) generate the entire patch function, 2) fill in a chunk of code given the prefix and suffix, and 3) output a single-line fix. We apply the LLMs under these repair settings on 5 datasets across 3 different languages and compare the LLMs in the number of bugs fixed, generation speed, and compilation rate. We also compare the LLMs against recent state-of-the-art APR tools. Our study demonstrates that directly applying state-of-the-art LLMs can already substantially outperform all existing APR techniques on all our datasets. Among the studied LLMs, a scaling effect exists for APR: larger models tend to achieve better performance. Also, we show for the first time that the suffix code after the buggy line (adopted in infilling-style APR) is important not only for generating more fixes but also for generating patches with a higher compilation rate. Besides patch generation, the LLMs consider correct patches to be more natural than other ones, and can even be leveraged for effective patch ranking or patch correctness checking. Lastly, we show that LLM-based APR can be further substantially boosted by 1) increasing the sample size and 2) incorporating fix template information.
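The three repair settings differ mainly in what context the model receives and how much code it must produce. The following is a minimal sketch of that difference, not the prompt format used in the study: the function names `build_repair_inputs` and `apply_patch`, the dictionary keys, and the choice to pass the whole buggy function as context in the first setting are all assumptions made for illustration.

```python
from typing import Dict, List


def build_repair_inputs(func_lines: List[str], start: int, end: int) -> Dict[str, Dict[str, str]]:
    """Split a buggy function into model inputs for each repair setting.

    func_lines: source lines of the buggy function.
    [start, end): 0-based span of the buggy lines (hypothetical convention).
    """
    if not 0 <= start < end <= len(func_lines):
        raise ValueError("buggy span must lie inside the function")

    prefix = "\n".join(func_lines[:start])  # code before the buggy span
    suffix = "\n".join(func_lines[end:])    # code after the buggy span

    inputs = {
        # 1) Entire function: the model rewrites the whole function.
        #    Passing the buggy function as context is an assumption here.
        "complete_function": {"context": "\n".join(func_lines)},
        # 2) Infilling: the model fills the gap between prefix and suffix.
        "infilling": {"prefix": prefix, "suffix": suffix},
    }
    if end - start == 1:
        # 3) Single line: same gap, but exactly one line is expected back.
        inputs["single_line"] = {"prefix": prefix, "suffix": suffix}
    return inputs


def apply_patch(func_lines: List[str], start: int, end: int, patch_lines: List[str]) -> List[str]:
    """Replace the buggy span with candidate patch lines."""
    return func_lines[:start] + patch_lines + func_lines[end:]


buggy = ["def add(a, b):", "    result = a - b", "    return result"]
inputs = build_repair_inputs(buggy, 1, 2)
patched = apply_patch(buggy, 1, 2, ["    result = a + b"])
```

Keeping the suffix as a separate field makes it easy to compare generation with and without the code that follows the buggy line, which is the comparison the abstract highlights.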

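The observation that LLMs consider correct patches more natural suggests a simple ranking scheme: score each candidate patch by how surprising the model finds its tokens, then try the least surprising patches first. The sketch below assumes per-token natural-log probabilities are already available from some model; the names `mean_entropy` and `rank_patches` and the numbers in the example are hypothetical, and the study's exact scoring may differ.

```python
from typing import Dict, List, Tuple


def mean_entropy(token_logprobs: List[float]) -> float:
    """Average negative log-probability of a patch's tokens (lower = more natural)."""
    if not token_logprobs:
        raise ValueError("a patch needs at least one token")
    return -sum(token_logprobs) / len(token_logprobs)


def rank_patches(candidates: Dict[str, List[float]]) -> List[Tuple[str, float]]:
    """Sort candidate patches from most natural to least natural."""
    scored = [(patch, mean_entropy(logprobs)) for patch, logprobs in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical log-probabilities for two candidate single-line patches.
candidates = {
    "return a - b": [-0.9, -1.5, -0.6],
    "return a + b": [-0.1, -0.2, -0.1],
}
ranked = rank_patches(candidates)
```

Averaging over tokens avoids penalizing longer patches merely for their length; summing instead of averaging is an equally plausible variant.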
Published In

ICSE '23: Proceedings of the 45th International Conference on Software Engineering
May 2023
2713 pages
ISBN: 9781665457019
  • General Chair: John Grundy
  • Program Co-chairs: Lori Pollock, Massimiliano Di Penta

In-Cooperation

  • IEEE CS

Publisher

IEEE Press

Conference

ICSE '23: 45th International Conference on Software Engineering
May 14 - 20, 2023
Melbourne, Victoria, Australia

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
