research-article

Open access

Denotational validation of higher-order Bayesian inference

Authors:

Matthijs Vákár,

Klaus Ostermann,

Zoubin Ghahramani

Proceedings of the ACM on Programming Languages, Volume 2, Issue POPL

Article No.: 60, Pages 1 - 29

https://doi.org/10.1145/3158148

Published: 27 December 2017

Abstract

We present a modular semantic account of Bayesian inference algorithms for probabilistic programming languages, as used in data science and machine learning. Sophisticated inference algorithms are often explained in terms of composition of smaller parts. However, neither their theoretical justification nor their implementation reflects this modularity. We show how to conceptualise and analyse such inference algorithms as manipulating intermediate representations of probabilistic programs using higher-order functions and inductive types, and their denotational semantics.
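To make the idea of "manipulating intermediate representations" concrete, here is a minimal sketch in Python. It is not the paper's construction: the weighted-list representation and the names `bind`, `denote`, `collapse` and `normalise` are hypothetical, chosen only for illustration, and the example is restricted to finite discrete programs. It shows a transformation on representations (`collapse`) that is valid in the sense that it leaves the denoted measure unchanged.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical intermediate representation (an assumption of this sketch):
# a finite list of (value, weight) pairs.

def bind(rep, kernel):
    """Sequence a representation with a kernel: value -> representation."""
    return [(y, w * v) for x, w in rep for y, v in kernel(x)]

def denote(rep):
    """Semantics: the unnormalised measure that the representation stands for."""
    measure = defaultdict(Fraction)
    for x, w in rep:
        measure[x] += w
    return {x: w for x, w in measure.items() if w != 0}

def collapse(rep):
    """A transformation on representations: merge entries with equal values."""
    return sorted(denote(rep).items())

def normalise(measure):
    """Turn an unnormalised measure into a probability distribution."""
    total = sum(measure.values())
    return {x: w / total for x, w in measure.items()}

# Model: flip two fair coins, condition on at least one head, return #heads.
coin = [(True, Fraction(1, 2)), (False, Fraction(1, 2))]
program = bind(
    coin,
    lambda a: bind(
        coin,
        # The weight 0 or 1 plays the role of a hard conditioning score.
        lambda b: [(int(a) + int(b), Fraction(int(a or b)))],
    ),
)

# Validity of the transformation: the denotation is preserved.
assert denote(collapse(program)) == denote(program)
posterior = normalise(denote(program))
```

In this toy setting, `posterior` assigns probability 2/3 to one head and 1/3 to two heads. The representation gets shorter under `collapse`, but its meaning does not change.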

Semantic accounts of continuous distributions use measurable spaces. However, our use of higher-order functions presents a substantial technical difficulty: it is impossible to define a measurable space structure over the collection of measurable functions between arbitrary measurable spaces that is compatible with standard operations on those functions, such as function application. We overcome this difficulty using quasi-Borel spaces, a recently proposed mathematical structure that supports both function spaces and continuous distributions.

We define a class of semantic structures for representing probabilistic programs, and semantic validity criteria for transformations of these representations in terms of distribution preservation. We develop a collection of building blocks for composing representations. We use these building blocks to validate common inference algorithms such as Sequential Monte Carlo and Markov Chain Monte Carlo. To emphasize the connection between the semantic manipulation and its traditional measure-theoretic origins, we use Kock's synthetic measure theory. We demonstrate its usefulness by proving a quasi-Borel counterpart to the Metropolis-Hastings-Green theorem.
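For readers unfamiliar with the classical result, the following sketch illustrates what "distribution preservation" means for Metropolis-Hastings in the simplest possible setting: a finite state space with exact rational arithmetic. This is the textbook finite-state algorithm, not the quasi-Borel theorem proved in the paper; the function names `mh_kernel` and `push_forward`, and the particular target and proposal, are assumptions made for the example.

```python
from fractions import Fraction

def mh_kernel(target, proposal):
    """Build the Metropolis-Hastings transition matrix on states 0..n-1.

    target:   unnormalised positive weights, target[x] = pi(x)
    proposal: row-stochastic matrix, proposal[x][y] = q(y | x)
    """
    n = len(target)
    kernel = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if x == y or proposal[x][y] == 0:
                continue
            # Acceptance probability min(1, pi(y) q(x|y) / (pi(x) q(y|x))).
            ratio = (target[y] * proposal[y][x]) / (target[x] * proposal[x][y])
            kernel[x][y] = proposal[x][y] * min(Fraction(1), ratio)
        # Rejected proposals and self-proposals keep the chain at x.
        kernel[x][x] = 1 - sum(kernel[x])
    return kernel

def push_forward(measure, kernel):
    """Apply a kernel to a measure: (measure K)(y) = sum_x measure(x) K(x, y)."""
    n = len(measure)
    return [sum(measure[x] * kernel[x][y] for x in range(n)) for y in range(n)]

# Illustrative target and (asymmetric) proposal, chosen arbitrarily.
target = [Fraction(1), Fraction(2), Fraction(3)]
proposal = [
    [Fraction(0), Fraction(1, 2), Fraction(1, 2)],
    [Fraction(1, 3), Fraction(1, 3), Fraction(1, 3)],
    [Fraction(3, 4), Fraction(1, 4), Fraction(0)],
]
kernel = mh_kernel(target, proposal)

# Validity: one Metropolis-Hastings step leaves the target measure invariant.
assert push_forward(target, kernel) == target
```

Using `Fraction` rather than floating point makes the invariance check an exact equality instead of an approximate one.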

Supplementary Material

WEBM File (bayesianinference.webm), 139.06 MB

Information & Contributors

Information

Published In

Proceedings of the ACM on Programming Languages Volume 2, Issue POPL

January 2018

1961 pages

EISSN:2475-1421

DOI:10.1145/3177123

Copyright © 2017 Owner/Author.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 December 2017

Published in PACMPL Volume 2, Issue POPL

Qualifiers

Research-article

Funding Sources

Engineering and Physical Sciences Research Council (EPSRC)
Royal Society
Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korea government (MSIP)
Balliol College Oxford
University College Oxford

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 33
Total Downloads: 1,116
Downloads (Last 12 months): 161
Downloads (Last 6 weeks): 20

Reflects downloads up to 10 Oct 2024

Citations

Cited By

Becker M, Lew A, Wang X, Ghavami M, Huot M, Rinard M, Mansinghka V. (2024). Probabilistic Programming with Programmable Variational Inference. Proceedings of the ACM on Programming Languages 8:PLDI (2123-2147). DOI: 10.1145/3656463. Online publication date: 20-Jun-2024.
https://dl.acm.org/doi/10.1145/3656463
Li J, Wang E, Zhang Y. (2024). Compiling Probabilistic Programs for Variable Elimination with Information Flow. Proceedings of the ACM on Programming Languages 8:PLDI (1755-1780). DOI: 10.1145/3656448. Online publication date: 20-Jun-2024.
https://dl.acm.org/doi/10.1145/3656448
Faggian C, Pautasso D, Vanoni G. (2024). Higher Order Bayesian Networks, Exactly. Proceedings of the ACM on Programming Languages 8:POPL (2514-2546). DOI: 10.1145/3632926. Online publication date: 5-Jan-2024.
https://dl.acm.org/doi/10.1145/3632926
Dalloo A, Jaleel Humaidi A, Al Mhdawi A, Al-Raweshidy H. (2024). Approximate Computing: Concepts, Architectures, Challenges, Applications, and Future Directions. IEEE Access 12 (146022-146088). DOI: 10.1109/ACCESS.2024.3467375. Online publication date: 2024.
https://doi.org/10.1109/ACCESS.2024.3467375
Lundén D, Hummelgren L, Kudlicka J, Eriksson O, Broman D. (2024). Suspension Analysis and Selective Continuation-Passing Style for Universal Probabilistic Programming Languages. Programming Languages and Systems (302-330). DOI: 10.1007/978-3-031-57267-8_12. Online publication date: 5-Apr-2024.
https://doi.org/10.1007/978-3-031-57267-8_12
Sennesh E, van de Meent J. (2023). String Diagrams with Factorized Densities. Electronic Proceedings in Theoretical Computer Science 397 (260-278). DOI: 10.4204/EPTCS.397.16. Online publication date: 14-Dec-2023.
https://doi.org/10.4204/EPTCS.397.16
Nguyen M, Perera R, Wang M, Ramsay S, McDonell T, Vazou N. (2023). Effect Handlers for Programmable Inference. Proceedings of the 16th ACM SIGPLAN International Haskell Symposium (44-58). DOI: 10.1145/3609026.3609729. Online publication date: 30-Aug-2023.
https://dl.acm.org/doi/10.1145/3609026.3609729
Lew A, Ghavamizadeh M, Rinard M, Mansinghka V. (2023). Probabilistic Programming with Stochastic Probabilities. Proceedings of the ACM on Programming Languages 7:PLDI (1708-1732). DOI: 10.1145/3591290. Online publication date: 6-Jun-2023.
https://dl.acm.org/doi/10.1145/3591290
Tassarotti J, Tristan J. (2023). Verified Density Compilation for a Probabilistic Programming Language. Proceedings of the ACM on Programming Languages 7:PLDI (615-637). DOI: 10.1145/3591245. Online publication date: 6-Jun-2023.
https://dl.acm.org/doi/10.1145/3591245
Lew A, Huot M, Staton S, Mansinghka V. (2023). ADEV: Sound Automatic Differentiation of Expected Values of Probabilistic Programs. Proceedings of the ACM on Programming Languages 7:POPL (121-153). DOI: 10.1145/3571198. Online publication date: 11-Jan-2023.
https://dl.acm.org/doi/10.1145/3571198