From EM to Data Augmentation: The Emergence of MCMC Bayesian Computation in the 1980s
Martin A. Tanner, Wing H. Wong

TL;DR
This paper explores the historical development of MCMC methods in Bayesian inference during the 1980s, highlighting how the integration of ideas from statistical physics and latent variable models led to widespread adoption.
Contribution
It analyzes the critical period when MCMC methods emerged in Bayesian statistics, emphasizing the role of auxiliary variables and collective moves in their adoption.
Findings
MCMC gained acceptance as a general computational tool for Bayesian inference between 1980 and 1990.
The integration of ideas from statistical physics with latent variable models was key to that acceptance.
Introducing auxiliary variables made the methods practical to implement in standard Bayesian problems.
Abstract
It was known from Metropolis et al. [J. Chem. Phys. 21 (1953) 1087--1092] that one can sample from a distribution by performing Monte Carlo simulation from a Markov chain whose equilibrium distribution is equal to the target distribution. However, it took several decades before the statistical community embraced Markov chain Monte Carlo (MCMC) as a general computational tool in Bayesian inference. The usual reasons that are advanced to explain why statisticians were slow to catch on to the method include lack of computing power and unfamiliarity with the early dynamic Monte Carlo papers in the statistical physics literature. We argue that there was a deeper reason, namely, that the structure of problems in the statistical mechanics and those in the standard statistical literature are different. To make the methods usable in standard Bayesian problems, one had to exploit the power that…
