{{Bayesian statistics}}
In [[Bayesian inference]], the '''Bernstein–von Mises theorem''' provides the basis for using Bayesian credible sets for confidence statements in [[parametric model|parametric models]]. It states that under some conditions, a posterior distribution converges in [[total variation distance]], in the limit of infinite data, to a multivariate [[normal distribution]] centred at the maximum likelihood estimator <math>\widehat{\theta}_n</math> with covariance matrix <math>n^{-1}\mathcal{I}(\theta_0)^{-1}</math>, where <math>\theta_0</math> is the true population parameter and <math>\mathcal{I}(\theta_0)</math> is the [[Fisher information]] matrix at <math>\theta_0</math>:
:<math>\left\|P(\theta\mid x_1,\dots,x_n) - \mathcal{N}\left(\widehat{\theta}_n, n^{-1}\mathcal{I}(\theta_0)^{-1}\right)\right\|_{\mathrm{TV}} \xrightarrow{P_{\theta_0}} 0.</math>
The Bernstein–von Mises theorem links [[Bayesian inference]] with [[frequentist inference]]. It assumes there is some true probabilistic process that generates the observations, as in frequentism, and then studies the quality of Bayesian methods of recovering that process and making uncertainty statements about that process. In particular, it states that asymptotically, many Bayesian credible sets of a certain credibility level <math>\alpha</math> will act as confidence sets of confidence level <math>\alpha</math>, which allows a frequentist interpretation of Bayesian uncertainty statements.
==Statement==
Consider independent, identically distributed observations <math>X_1, \ldots, X_n</math> from a model <math>(P_\theta\,:\,\theta\in\Theta)</math> with parameter space <math>\Theta\subseteq\mathbb{R}^k</math>, and let <math>\theta_0\in\Theta</math> be the true parameter. Suppose the following conditions hold:
# The model admits densities <math>(p_\theta\,:\,\theta\in\Theta)</math> with respect to some measure <math>\mu</math>.
# The Fisher information matrix <math>\mathcal{I}(\theta_0)</math> is nonsingular.
# The model is differentiable in quadratic mean. That is, there exists a measurable function <math>f:\mathcal{X}\rightarrow\mathbb{R}^k</math> such that <math>\int\left[\sqrt{p_\theta(x)} - \sqrt{p_{\theta_0}(x)}- \frac{1}{2}(\theta - \theta_0)^\top f(x)\sqrt{p_{\theta_0}(x)}\right]^2 \mathrm{d}\mu(x) = o(||\theta - \theta_0||^2) </math> as <math>\theta \rightarrow \theta_0</math>.
# For every <math>\varepsilon > 0</math>, there exists a sequence of test functions <math>\phi_n:\mathcal{X}^n \rightarrow [0, 1]</math> such that <math>\mathbb{E}_{\mathbf{X} \sim P^n_{\theta_0}}\left[\phi_n(\mathbf{X})\right] \rightarrow 0</math> and <math>\sup_{\theta \,:\, ||\theta-\theta_0||>\varepsilon} \mathbb{E}_{\mathbf{X}\sim P^n_{\theta}}\left[1 - \phi_n(\mathbf{X})\right] \rightarrow 0</math> as <math>n \rightarrow \infty</math>.
# The prior measure is absolutely continuous with respect to the Lebesgue measure in a neighborhood of <math>\theta_0</math>, with a continuous positive density at <math>\theta_0</math>.
Then for any estimator <math>\widehat{\theta}_n</math> satisfying <math>\sqrt{n}( {\widehat{\theta}}_n - \theta_0) \xrightarrow{d} \mathcal{N}(0, {\mathcal{I}}^{-1}(\theta_0))</math>, the posterior distribution <math>\Pi_n</math> of <math>\theta\mid X_1, \ldots, X_n</math> satisfies<blockquote><math>{\left|\left|\Pi_n - \mathcal{N}\left(\widehat{\theta}_n, \frac{1}{n}{\mathcal{I}}^{-1}({\theta_0})\right)\right|\right|}_{\mathrm{TV}} \xrightarrow{P_{\theta_0}} 0</math></blockquote>as <math>n\rightarrow \infty</math>.
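The convergence in total variation can be checked numerically in a conjugate model. The following sketch (our own illustration, not part of the theorem's statement) uses Bernoulli data with a uniform Beta(1, 1) prior, so the posterior is exactly Beta(1 + s, 1 + n − s); the total variation distance to the normal limit predicted by the theorem shrinks as <math>n</math> grows. All variable names and the chosen parameter values are illustrative.

```python
import numpy as np
from scipy import stats

def tv_to_bvm_normal(theta0, n, seed=0):
    """Approximate the total variation distance between the exact Beta
    posterior and the normal limit predicted by Bernstein-von Mises."""
    rng = np.random.default_rng(seed)
    s = rng.binomial(n, theta0)                  # observed number of successes
    theta_hat = s / n                            # maximum likelihood estimator
    posterior = stats.beta(1 + s, 1 + n - s)     # exact posterior, uniform prior
    # For Bernoulli, I(theta_0) = 1 / (theta_0 * (1 - theta_0)), so the
    # predicted limit is N(theta_hat, theta_0 * (1 - theta_0) / n).
    limit = stats.norm(theta_hat, np.sqrt(theta0 * (1 - theta0) / n))
    # TV distance = (1/2) * integral of |p - q|, approximated on a fine grid.
    grid = np.linspace(1e-6, 1 - 1e-6, 20001)
    diff = np.abs(posterior.pdf(grid) - limit.pdf(grid))
    return 0.5 * np.sum(diff) * (grid[1] - grid[0])

tv_small_n = tv_to_bvm_normal(0.3, 20)
tv_large_n = tv_to_bvm_normal(0.3, 2000)
print(tv_small_n, tv_large_n)  # the distance decreases as n grows
```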
==Bernstein–von Mises and maximum likelihood estimation==
Under certain regularity conditions, the [[maximum likelihood estimator]] is an asymptotically efficient estimator and can thus be used as <math>\widehat{\theta}_n</math> in the theorem statement. This then yields that the posterior distribution converges in total variation distance to the asymptotic distribution of the maximum likelihood estimator, which is commonly used to construct frequentist confidence sets.
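As a consequence, for large samples a Bayesian credible interval and the Wald confidence interval built from the MLE nearly coincide. The sketch below (our own illustration, with hypothetical parameter values) compares the two for Bernoulli data under a uniform prior, where the MLE is the sample proportion.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
theta0, n = 0.4, 5000
s = rng.binomial(n, theta0)

theta_hat = s / n                                # maximum likelihood estimator
se = np.sqrt(theta_hat * (1 - theta_hat) / n)    # plug-in asymptotic standard error
z = stats.norm.ppf(0.975)
wald = (theta_hat - z * se, theta_hat + z * se)  # frequentist 95% Wald interval

posterior = stats.beta(1 + s, 1 + n - s)         # exact posterior, uniform prior
credible = (posterior.ppf(0.025), posterior.ppf(0.975))  # 95% equal-tailed credible interval

print(wald)
print(credible)  # nearly identical to the Wald interval for this n
```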
==Implications==
Different summary statistics such as the [[Mode (statistics)|mode]] and mean may behave differently in the posterior distribution. In Freedman's examples, the posterior density and its mean can converge on the wrong result, but the posterior mode is consistent and will converge on the correct result.
==Notes==
{{Reflist}}
==References==
*{{cite book|last=Hartigan |first=J. A. |authorlink=John A. Hartigan |chapter=Asymptotic Normality of Posterior Distributions |title=Bayes Theory |location=New York |publisher=Springer |year=1983 |isbn= |doi=10.1007/978-1-4613-8242-3_11 }}
*{{cite book |last=Le Cam |first=Lucien |authorlink=Lucien Le Cam |title=Asymptotic Methods in Statistical Decision Theory |chapter=Approximately Gaussian Posterior Distributions |pages=336–345 |location=New York |publisher=Springer |year=1986 |isbn=0-387-96307-3 }}
*{{cite book|last=van der Vaart|first=A. W. |title=Asymptotic Statistics
|year=1998|publisher=Cambridge University Press|isbn= 0-521-49603-9|chapter=Bernstein–von Mises Theorem}}