research-article
Open access

What we eval in the shadows: a large-scale study of eval in R programs

Published: 15 October 2021

Abstract

Most dynamic languages allow users to turn text into code using various functions, often named <tt>eval</tt>, with language-dependent semantics. The widespread use of these reflective functions hinders static analysis and prevents compilers from performing optimizations. This paper aims to provide a better sense of why programmers use <tt>eval</tt>. Understanding why <tt>eval</tt> is used in practice is key to finding ways to mitigate its negative impact. We have reasons to believe that reflective feature usage is language and application domain-specific; we focus on data science code written in R and compare our results to previous work that analyzed web programming in JavaScript. We analyze 49,296,059 calls to <tt>eval</tt> from 240,327 scripts extracted from 15,401 R packages. We find that <tt>eval</tt> is indeed in widespread use; R’s <tt>eval</tt> is more pervasive and arguably dangerous than what was previously reported for JavaScript.
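To make the first two sentences of the abstract concrete, here is a minimal sketch of how an <tt>eval</tt>-style function turns text into code and why that defeats static reasoning. It is written in Python rather than R and is not taken from the paper; the helper names `double`, `square`, and `call_by_name` are hypothetical and exist only for this illustration.

```python
# Illustrative sketch only: Python's built-in eval, not R's, and not the
# paper's methodology. All function names here are hypothetical.

def double(x: int) -> int:
    return 2 * x


def square(x: int) -> int:
    return x * x


def call_by_name(function_name: str, argument: int) -> int:
    """Assemble source text at run time and evaluate it."""
    source = f"{function_name}({argument})"  # code built from plain text
    # The call target exists only as a string until this point, so a tool
    # that reads the file without running it cannot tell which function
    # will be invoked. Builtins are emptied and only two names are exposed
    # to keep the example contained.
    allowed = {"double": double, "square": square}
    return eval(source, {"__builtins__": {}}, allowed)


print(call_by_name("double", 21))  # 42
print(call_by_name("square", 5))   # 25
```

Because the evaluated text can depend on arbitrary run-time input, an analyzer or compiler generally has to assume that any function may be called, which is the kind of obstacle to analysis and optimization the abstract describes.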

Supplementary Material

Auxiliary Presentation Video (oopsla21main-p121-p-video.mp4)
This is a video of my talk at OOPSLA'21 on our paper, "What We Eval in the Shadows".

Published In

Proceedings of the ACM on Programming Languages  Volume 5, Issue OOPSLA
October 2021
2001 pages
EISSN: 2475-1421
DOI: 10.1145/3492349
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dynamic languages
  2. eval

Qualifiers

  • Research-article
