research-article
Open access

Denotational validation of higher-order Bayesian inference

Published: 27 December 2017

Abstract

We present a modular semantic account of Bayesian inference algorithms for probabilistic programming languages, as used in data science and machine learning. Sophisticated inference algorithms are often explained in terms of composition of smaller parts. However, neither their theoretical justification nor their implementation reflects this modularity. We show how to conceptualise and analyse such inference algorithms as manipulating intermediate representations of probabilistic programs using higher-order functions and inductive types, and their denotational semantics.
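As a rough illustration of what "an intermediate representation built from an inductive type and higher-order functions" can look like, the sketch below encodes a tiny discrete probabilistic program as a tree and gives it two interpreters. The constructor names (`Return`, `Sample`, `Score`) and both interpreters are hypothetical, chosen for this example only, and the finite discrete setting is a simplifying assumption; it is not the construction used in the paper.

```python
import random
from dataclasses import dataclass
from fractions import Fraction
from typing import Any, Callable, List, Tuple


# Hypothetical inductive representation of a probabilistic program.
@dataclass
class Return:
    value: Any


@dataclass
class Sample:
    choices: List[Tuple[Any, Fraction]]   # finite (outcome, probability) pairs
    k: Callable[[Any], "Program"]         # continuation: a higher-order component


@dataclass
class Score:
    weight: Fraction                      # multiplicative likelihood factor
    rest: "Program"


def enumerate_program(prog, weight=Fraction(1)):
    """Interpreter 1: exact enumeration into a list of (value, weight) pairs."""
    if isinstance(prog, Return):
        return [(prog.value, weight)]
    if isinstance(prog, Score):
        return enumerate_program(prog.rest, weight * prog.weight)
    out = []
    for x, p in prog.choices:
        out.extend(enumerate_program(prog.k(x), weight * p))
    return out


def sample_once(prog, rng, weight=Fraction(1)):
    """Interpreter 2: draw one weighted sample (likelihood weighting)."""
    while not isinstance(prog, Return):
        if isinstance(prog, Score):
            weight *= prog.weight
            prog = prog.rest
        else:
            outcomes = [x for x, _ in prog.choices]
            probs = [float(p) for _, p in prog.choices]
            prog = prog.k(rng.choices(outcomes, weights=probs)[0])
    return prog.value, weight


# Two fair coin flips, conditioned on at least one head; return the first flip.
fair = [(True, Fraction(1, 2)), (False, Fraction(1, 2))]
program = Sample(fair, lambda a: Sample(
    fair, lambda b: Score(Fraction(1 if (a or b) else 0), Return(a))))

pairs = enumerate_program(program)
mass_true = sum(w for v, w in pairs if v)
posterior_true = mass_true / sum(w for _, w in pairs)
```

Both interpreters consume the same tree, which is the sense in which the representation is "intermediate": an inference step can be phrased as a function from one such structure to another.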
Semantic accounts of continuous distributions use measurable spaces. However, our use of higher-order functions presents a substantial technical difficulty: it is impossible to define a measurable space structure over the collection of measurable functions between arbitrary measurable spaces that is compatible with standard operations on those functions, such as function application. We overcome this difficulty using quasi-Borel spaces, a recently proposed mathematical structure that supports both function spaces and continuous distributions.
We define a class of semantic structures for representing probabilistic programs, and semantic validity criteria for transformations of these representations in terms of distribution preservation. We develop a collection of building blocks for composing representations. We use these building blocks to validate common inference algorithms such as Sequential Monte Carlo and Markov Chain Monte Carlo. To emphasise the connection between the semantic manipulation and its traditional measure-theoretic origins, we use Kock's synthetic measure theory. We demonstrate its usefulness by proving a quasi-Borel counterpart to the Metropolis-Hastings-Green theorem.
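The validity criterion described above, that a transformation of representations should preserve the distribution they denote, can be sketched in a finite discrete setting. Everything here is an assumption made for illustration: the weighted-list representation, the names `unit`, `bind`, `score`, `meaning`, `collapse` and `preserves_meaning`, and the restriction to finitely many outcomes. The paper works with quasi-Borel spaces; this toy only checks one input and is not a proof of validity.

```python
from collections import defaultdict
from fractions import Fraction


# Hypothetical representation: a finite list of (value, weight) pairs.
def unit(x):
    return [(x, Fraction(1))]


def bind(rep, k):
    """Sequence a representation with a continuation k: value -> representation."""
    return [(y, w * v) for x, w in rep for y, v in k(x)]


def score(w):
    """Representation that only contributes a multiplicative weight."""
    return [(None, Fraction(w))]


def meaning(rep):
    """Denotation: the unnormalised measure, as a dict from value to total mass."""
    mass = defaultdict(Fraction)
    for x, w in rep:
        mass[x] += w
    return {x: w for x, w in mass.items() if w != 0}


def collapse(rep):
    """A candidate transformation: merge entries that carry the same value."""
    return list(meaning(rep).items())


def preserves_meaning(transform, rep):
    """Check distribution preservation of `transform` on one representation."""
    return meaning(transform(rep)) == meaning(rep)


# Uniform prior over three coin biases; observe one head; ask whether bias >= 1/2.
third = Fraction(1, 3)
prior = [(Fraction(1, 4), third), (Fraction(1, 2), third), (Fraction(3, 4), third)]
rep = bind(prior, lambda p: bind(score(p), lambda _: unit(p >= Fraction(1, 2))))

valid = preserves_meaning(collapse, rep)               # merges duplicates only
invalid = preserves_meaning(lambda r: r[:-1], rep)     # silently drops mass
```

Here `collapse` changes the representation (three entries become two) without changing the measure it denotes, while dropping the last entry changes the measure and so fails the check.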

Supplementary Material

WEBM File (bayesianinference.webm)


Published In

Proceedings of the ACM on Programming Languages, Volume 2, Issue POPL
January 2018
1961 pages
EISSN:2475-1421
DOI:10.1145/3177123
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Bayesian inference
  2. Kock integration
  3. applied category theory
  4. commutative monads
  5. initial algebra semantics
  6. quasi-Borel spaces
  7. sigma-monoids
  8. synthetic measure theory

Qualifiers

  • Research-article

Funding Sources

  • Engineering and Physical Sciences Research Council (EPSRC)
  • Royal Society
  • Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP)
  • Balliol College Oxford
  • University College Oxford
