DOI: 10.1145/3613905.3650896
Work in Progress

Evaluating Human-AI Partnership for LLM-based Code Migration

Published: 11 May 2024

Abstract

The potential of Generative AI, especially Large Language Models (LLMs), to transform software development is remarkable. In this paper, we focus on one area of software development: code migration. We define code migration as the process of transitioning a code repository from one language version to another by converting both the source code and its dependencies. When performing code migrations, carefully designing an effective human-AI partnership is essential for boosting developer productivity and accelerating migrations. Although human-AI partnerships have been explored broadly in the literature, their application to code migration remains largely unexamined. In this work, we leverage an LLM-based code migration tool, Amazon Q Code Transformation, to conduct semi-structured interviews with 11 participants undertaking code migrations. We discuss the human's roles in the human-AI partnership (the human as a director and as a reviewer) and define a trust framework, based on various model outcomes, for earning trust in LLMs. The guidelines presented in this paper offer a vital starting point for designing human-AI partnerships that effectively augment and complement human capabilities in software development with Generative AI.
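To make the definition above concrete, the sketch below is a hypothetical illustration (not taken from the paper) of the kind of source-level change a Java version migration might involve: the same behavior expressed in an older idiom and in a newer one, with the migration required to preserve semantics. The class and method names are invented for illustration only; a real migration would also update build dependencies alongside the source.

```java
// Hypothetical sketch of a source change a Java 8 -> Java 17 migration
// might apply: pattern matching for instanceof (JEP 394) replaces the
// older instanceof-and-cast idiom. Behavior must be preserved exactly.
public class MigrationExample {

    // Java 8 style: explicit instanceof check followed by a cast.
    static String describeLegacy(Object o) {
        if (o instanceof String) {
            String s = (String) o;
            return "string of length " + s.length();
        }
        return "other";
    }

    // Java 17 style: the pattern variable binds in the condition itself.
    static String describeMigrated(Object o) {
        if (o instanceof String s) {
            return "string of length " + s.length();
        }
        return "other";
    }

    public static void main(String[] args) {
        // A migration is only correct if old and new versions agree.
        System.out.println(describeLegacy("abc"));
        System.out.println(describeMigrated("abc"));
        System.out.println(describeMigrated(42));
    }
}
```

The second half of the paper's definition, updating dependencies, would show up not in source files like this but in build configuration (e.g., bumping library versions in a Maven or Gradle file), which is why the authors frame migration as converting "both the source code and its dependencies."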

Supplemental Material

MP4 File: Video Preview
MP4 File: Talk Video


Published In

CHI EA '24: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems
May 2024
4761 pages
ISBN:9798400703317
DOI:10.1145/3613905
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Application Modernization
  2. Code Migration
  3. Human-AI Partnership
  4. Human-in-the-Loop Techniques
  5. Trust Framework

Qualifiers

  • Work in progress
  • Research
  • Refereed limited

Conference

CHI '24

Acceptance Rates

Overall Acceptance Rate 6,164 of 23,696 submissions, 26%

