DOI: 10.1145/3597503.3639139
research-article
Open access

A Theory of Scientific Programming Efficacy

Published: 12 April 2024

Abstract

Scientists write and maintain software artifacts to construct, validate, and apply scientific theories. Despite the centrality of software in their work, their practices differ significantly from those of professional software engineers. We sought to understand what makes scientists effective at their work and how software engineering practices and tools can be adapted to fit their workflows. We interviewed 25 scientists and support staff to understand their work. Then, we constructed a theory that relates six factors that contribute to their efficacy in creating and maintaining software systems. We present the theory in the form of a cycle of scientific computing efficacy and identify opportunities for improvement based on the six contributing factors.

Published In

ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
May 2024
2942 pages
ISBN: 9798400702174
DOI: 10.1145/3597503
This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

In-Cooperation

  • Faculty of Engineering of University of Porto

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. scientific programming
  2. qualitative study of programmers

Qualifiers

  • Research-article

Conference

ICSE '24

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 418
  • Downloads (Last 12 months): 418
  • Downloads (Last 6 weeks): 67

Reflects downloads up to 30 Jan 2025
