research-article
Open access

A simple differentiable programming language

Published: 20 December 2019

Abstract

Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear---discrepancies can arise, sometimes inadvertently. In order to study automatic differentiation in such programming contexts, we define a small but expressive programming language that includes a construct for reverse-mode differentiation. We give operational and denotational semantics for this language. The operational semantics employs popular implementation techniques, while the denotational semantics employs notions of differentiation familiar from real analysis. We establish that these semantics coincide.
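The operational semantics mentioned above employs popular implementation techniques: in many systems, reverse-mode differentiation is realized by recording a trace (a "tape") of primitive operations during evaluation and then propagating adjoints through the trace in reverse. As an illustrative sketch only (the `Var` class, `grad` function, and tape representation below are hypothetical, not the paper's language or any particular system's API), a minimal tape-based reverse-mode AD in Python might look like:

```python
# Minimal tape-based reverse-mode AD sketch (hypothetical; for illustration only).
class Var:
    """A value that records the operations applied to it on a shared tape."""
    def __init__(self, value, tape=None):
        self.value = value
        self.grad = 0.0
        self.tape = tape if tape is not None else []

    def _record(self, value, backprops):
        # Create the result and log (result, local-derivative closures) on the tape.
        out = Var(value, self.tape)
        self.tape.append((out, backprops))
        return out

    def __add__(self, other):
        # d(x+y)/dx = 1 and d(x+y)/dy = 1: adjoints pass through unchanged.
        return self._record(self.value + other.value,
                            [(self, lambda g: g), (other, lambda g: g)])

    def __mul__(self, other):
        # d(x*y)/dx = y and d(x*y)/dy = x.
        return self._record(self.value * other.value,
                            [(self, lambda g: g * other.value),
                             (other, lambda g: g * self.value)])

def grad(f, x):
    """Reverse-mode derivative of f at x: the forward pass records a tape,
    the backward pass replays it in reverse, accumulating adjoints."""
    tape = []
    v = Var(x, tape)
    out = f(v)
    out.grad = 1.0  # seed the output adjoint
    for node, backprops in reversed(tape):
        for parent, bp in backprops:
            parent.grad += bp(node.grad)
    return v.grad
```

For example, `grad(lambda v: v*v + v, 3.0)` computes the derivative of x² + x at x = 3, namely 7. The reverse traversal of the tape is what distinguishes reverse mode from forward mode: one backward pass yields the derivative with respect to every recorded input.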


Published In

Proceedings of the ACM on Programming Languages, Volume 4, Issue POPL
January 2020, 1984 pages
EISSN: 2475-1421
DOI: 10.1145/3377388
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automatic differentiation
  2. differentiable programming

