DOI: 10.1145/3674805.3690755
research-article

Multi-language Software Development in the LLM Era: Insights from Practitioners’ Conversations with ChatGPT

Published: 24 October 2024

Abstract

Non-trivial software systems are commonly developed using more than a single programming language. However, multi-language development is not straightforward. Nowadays, tools powered by Large Language Models (LLMs), such as ChatGPT, have been shown to successfully assist practitioners in several aspects of software development. This paper reports a preliminary study aimed at investigating to what extent ChatGPT is being used in multi-language development scenarios. To this end, we leveraged DevGPT, a dataset of conversations between software practitioners and ChatGPT. In total, we studied 3,584 conversations comprising 18,862 code snippets. Our analyses show that only 18.33% of the code snippets suggested by ChatGPT are written in the same programming language as the primary language of the repository where the conversation was shared. In an in-depth analysis, we observed expected scenarios, such as 31.54% of JavaScript snippets being suggested in CSS repositories. However, we also unveiled surprising ones, such as Python snippets being largely suggested in C++ repositories. After a qualitative open card sorting of the conversations, we found that in 70% of them developers were asking for coding support, while in 57% developers used ChatGPT as a tool to generate code. Our initial results not only indicate that LLMs are being used in multi-language development but also showcase the contexts in which such tools assist developers.
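
The language-match figure reported in the abstract can be illustrated with a minimal sketch. It assumes each snippet has already been reduced to a pair of (primary language of the repository, language of the snippet). The record layout, the toy values in `SNIPPETS`, and the function names are hypothetical and are not taken from DevGPT or from the paper's replication package. The per-repository breakdown shown here is only one possible reading of how cross-language percentages might be computed.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (repository primary language, snippet language).
# Illustrative values only; not data from DevGPT or from the paper.
SNIPPETS = [
    ("CSS", "JavaScript"),
    ("CSS", "CSS"),
    ("C++", "Python"),
    ("C++", "C++"),
    ("Python", "Python"),
    ("Python", "Bash"),
]


def same_language_share(records):
    """Fraction of snippets written in the repository's primary language."""
    if not records:
        return 0.0
    matches = sum(
        1
        for repo_lang, snippet_lang in records
        if repo_lang.lower() == snippet_lang.lower()
    )
    return matches / len(records)


def snippet_distribution(records):
    """For each repository language, the share of each snippet language."""
    counts = defaultdict(Counter)
    for repo_lang, snippet_lang in records:
        counts[repo_lang][snippet_lang] += 1
    return {
        repo_lang: {lang: n / sum(c.values()) for lang, n in c.items()}
        for repo_lang, c in counts.items()
    }


if __name__ == "__main__":
    print(f"same-language share: {same_language_share(SNIPPETS):.2%}")
    for repo_lang, shares in snippet_distribution(SNIPPETS).items():
        print(repo_lang, shares)
```

Comparing language names case-insensitively is a design choice of this sketch; a real analysis would also need to normalize aliases (for example `js` versus `JavaScript`), which is not attempted here.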

Published In

ESEM '24: Proceedings of the 18th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
October 2024
633 pages
ISBN: 9798400710476
DOI: 10.1145/3674805
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. ChatGPT
  2. Empirical Studies
  3. Multi-language Software Development

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • CNPq
  • CAPES

Conference

ESEM '24
Acceptance Rates

Overall Acceptance Rate 130 of 594 submissions, 22%
