research-article

Multi-language Software Development in the LLM Era: Insights from Practitioners’ Conversations with ChatGPT

Authors: Matheus Paixao, Matheus Freitas, Eliakim Gama

ESEM '24: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement

Pages 489 - 495

https://doi.org/10.1145/3674805.3690755

Published: 24 October 2024

Abstract

Non-trivial software systems are commonly developed using more than a single programming language. However, multi-language development is not straightforward. Nowadays, tools powered by Large Language Models (LLMs), such as ChatGPT, have been shown to successfully assist practitioners in several aspects of software development. This paper reports a preliminary study that investigates the extent to which ChatGPT is being used in multi-language development scenarios. To this end, we leveraged DevGPT, a dataset of conversations between software practitioners and ChatGPT. In total, we studied data from 3,584 conversations comprising 18,862 code snippets. Our analyses show that only 18.33% of the code snippets suggested by ChatGPT are written in the same programming language as the primary language of the repository where the conversation was shared. In an in-depth analysis, we observed expected scenarios, such as 31.54% of JavaScript snippets being suggested in CSS repositories. However, we also unveiled surprising ones, such as Python snippets being largely suggested in C++ repositories. After a qualitative open card sorting of the conversations, we found that in 70% of them developers were asking for coding support, while in 57% developers used ChatGPT as a tool to generate code. Our initial results not only indicate that LLMs are being used in multi-language development but also showcase the contexts in which such tools are assisting developers.
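The language-match figure reported in the abstract can be read as a simple proportion over (repository language, snippet language) pairs. The following is a minimal Python sketch of that kind of computation, not the authors' actual pipeline: the records are toy data invented for illustration, the function names are hypothetical, and the per-repository normalization in the second function is an assumption about how a breakdown such as the JavaScript/CSS one might be computed.

```python
from collections import Counter

# Toy records, invented for illustration only; they are NOT taken from DevGPT.
# Each tuple is (primary language of the repository, language of one snippet).
records = [
    ("CSS", "JavaScript"),
    ("CSS", "CSS"),
    ("C++", "Python"),
    ("Python", "Python"),
    ("Python", "Bash"),
]


def match_rate(pairs):
    """Share of snippets whose language equals the repository's primary language."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    matches = sum(1 for repo, snippet in pairs if repo.lower() == snippet.lower())
    return matches / len(pairs)


def snippet_share_by_repo_language(pairs):
    """For each repository language, the fraction of its snippets in each snippet language.

    Assumption: shares are normalized per repository language; the paper may
    normalize differently.
    """
    pairs = list(pairs)
    pair_counts = Counter(pairs)
    repo_totals = Counter(repo for repo, _ in pairs)
    return {(repo, snippet): n / repo_totals[repo]
            for (repo, snippet), n in pair_counts.items()}


rate = match_rate(records)
shares = snippet_share_by_repo_language(records)
print(f"{rate:.2%} of the toy snippets match their repository's primary language")
```

On real data, the pairs would have to be derived from the dataset's own fields, and language names would likely need normalization beyond the case-insensitive comparison used here.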

Index Terms

Multi-language Software Development in the LLM Era: Insights from Practitioners’ Conversations with ChatGPT
1. Human-centered computing
  1. Collaborative and social computing
    1. Collaborative and social computing systems and tools
      1. Open source software
  2. Visualization
    1. Visualization techniques
2. Software and its engineering
  1. Software creation and management
    1. Collaboration in software development
  2. Software notations and tools

Index terms have been assigned to the content through auto-classification.

Published In

ESEM '24: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement

October 2024

633 pages

ISBN: 9798400710476

DOI: 10.1145/3674805

Copyright © 2024 ACM.

Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Sponsors

SIGSOFT: ACM Special Interest Group on Software Engineering

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

CNPq
CAPES

Conference

ESEM '24

Sponsor: SIGSOFT

ESEM '24: ACM / IEEE International Symposium on Empirical Software Engineering and Measurement

October 24 - 25, 2024

Barcelona, Spain

Acceptance Rates

Overall Acceptance Rate 130 of 594 submissions, 22%
