research-article

Open access

A Theory of Scientific Programming Efficacy

Authors:

Elizaveta Pertseva,

Michael Coblenz

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

Article No.: 192, Pages 1 - 12

https://doi.org/10.1145/3597503.3639139

Published: 12 April 2024

Abstract

Scientists write and maintain software artifacts to construct, validate, and apply scientific theories. Despite the centrality of software in their work, their practices differ significantly from those of professional software engineers. We sought to understand what makes scientists effective at their work and how software engineering practices and tools can be adapted to fit their workflows. We interviewed 25 scientists and support staff to understand their work. Then, we constructed a theory that relates six factors that contribute to their efficacy in creating and maintaining software systems. We present the theory in the form of a cycle of scientific computing efficacy and identify opportunities for improvement based on the six contributing factors.

Published In

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering

May 2024

2942 pages

ISBN: 9798400702174

DOI: 10.1145/3597503

Co-chairs: Ana Paiva, Rui Abreu

Program Co-chairs: Abhik Roychoudhury, Margaret Storey

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

In-Cooperation

Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

Artifacts Available / v1.1

Conference

ICSE '24

Sponsor:

SIGSOFT

ICSE '24: IEEE/ACM 46th International Conference on Software Engineering

April 14 - 20, 2024

Lisbon, Portugal

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%
