research-article
Open access

A simple differentiable programming language

Published: 20 December 2019

Abstract

Automatic differentiation plays a prominent role in scientific computing and in modern machine learning, often in the context of powerful programming systems. The relation of the various embodiments of automatic differentiation to the mathematical notion of derivative is not always entirely clear---discrepancies can arise, sometimes inadvertently. In order to study automatic differentiation in such programming contexts, we define a small but expressive programming language that includes a construct for reverse-mode differentiation. We give operational and denotational semantics for this language. The operational semantics employs popular implementation techniques, while the denotational semantics employs notions of differentiation familiar from real analysis. We establish that these semantics coincide.
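The operational semantics mentioned above employs popular implementation techniques: in many systems, reverse-mode differentiation is realized by recording a trace (a "tape") of primitive operations during evaluation and then propagating adjoints through the trace in reverse. As an illustrative sketch only (the `Var` class, `grad` function, and tape representation below are hypothetical, not the paper's language or any particular system's API), a minimal tape-based reverse-mode AD in Python might look like:

```python
# Minimal tape-based reverse-mode AD sketch (hypothetical; for illustration only).
class Var:
    """A value that records the operations applied to it on a shared tape."""
    def __init__(self, value, tape=None):
        self.value = value
        self.grad = 0.0
        self.tape = tape if tape is not None else []

    def _record(self, value, backprops):
        # Create the result and log (result, local-derivative closures) on the tape.
        out = Var(value, self.tape)
        self.tape.append((out, backprops))
        return out

    def __add__(self, other):
        # d(x+y)/dx = 1 and d(x+y)/dy = 1: adjoints pass through unchanged.
        return self._record(self.value + other.value,
                            [(self, lambda g: g), (other, lambda g: g)])

    def __mul__(self, other):
        # d(x*y)/dx = y and d(x*y)/dy = x.
        return self._record(self.value * other.value,
                            [(self, lambda g: g * other.value),
                             (other, lambda g: g * self.value)])

def grad(f, x):
    """Reverse-mode derivative of f at x: the forward pass records a tape,
    the backward pass replays it in reverse, accumulating adjoints."""
    tape = []
    v = Var(x, tape)
    out = f(v)
    out.grad = 1.0  # seed the output adjoint
    for node, backprops in reversed(tape):
        for parent, bp in backprops:
            parent.grad += bp(node.grad)
    return v.grad
```

For example, `grad(lambda v: v*v + v, 3.0)` computes the derivative of x² + x at x = 3, namely 7. The reverse traversal of the tape is what distinguishes reverse mode from forward mode: one backward pass yields the derivative with respect to every recorded input.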


Published In

Proceedings of the ACM on Programming Languages, Volume 4, Issue POPL
January 2020, 1984 pages
EISSN: 2475-1421
DOI: 10.1145/3377388
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. automatic differentiation
  2. differentiable programming

