On Markov chain Monte Carlo methods for tall data
Rémi Bardenet, Arnaud Doucet, Chris Holmes

TL;DR
This paper reviews existing scalable MCMC methods for tall datasets, proposes a new subsampling approach with theoretical guarantees, and shows that the approach is effective when the Bernstein-von Mises approximation of the posterior is accurate, leaving less favorable scenarios as open challenges.
Contribution
The paper provides a comprehensive review of scalable MCMC methods and introduces a novel subsampling-based approach with provable closeness to the posterior distribution under favorable conditions.
Findings
Proposes a subsampling method that requires fewer than O(n) likelihood evaluations per iteration in certain models.
The method performs well when the Bernstein-von Mises approximation is accurate.
Open challenges remain in scenarios where the Bernstein-von Mises approximation is poor.
Abstract
Markov chain Monte Carlo methods are often deemed too computationally intensive to be of any practical use for big data applications, and in particular for inference on datasets containing a large number of individual data points, also known as tall datasets. In scenarios where data are assumed independent, various approaches to scale up the Metropolis-Hastings algorithm in a Bayesian inference context have been recently proposed in machine learning and computational statistics. These approaches can be grouped into two categories: divide-and-conquer approaches and subsampling-based algorithms. The aims of this article are as follows. First, we present a comprehensive review of the existing literature, commenting on the underlying assumptions and theoretical guarantees of each method. Second, by leveraging our understanding of these limitations, we propose an original…
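To make the cost that motivates this line of work concrete, here is a minimal sketch of plain random-walk Metropolis-Hastings, not the paper's subsampling method. It assumes a toy model chosen purely for illustration (independent unit-variance Gaussian observations with unknown mean and a flat prior), and the names `log_likelihood` and `metropolis_hastings` are hypothetical. The point is only that each iteration evaluates the likelihood of all n data points.

```python
import math
import random


def log_likelihood(theta, data):
    """Full-data log-likelihood for a unit-variance Gaussian: touches all n points."""
    const = -0.5 * math.log(2.0 * math.pi)
    return sum(const - 0.5 * (x - theta) ** 2 for x in data)


def metropolis_hastings(data, n_iters=1000, step=0.05, seed=0):
    """Random-walk MH for the Gaussian mean under a flat prior (illustrative only).

    Returns the chain and the total number of per-point likelihood evaluations.
    """
    rng = random.Random(seed)
    theta = 0.0
    current_ll = log_likelihood(theta, data)  # O(n)
    evaluations = len(data)
    samples = []
    for _ in range(n_iters):
        proposal = theta + rng.gauss(0.0, step)
        proposal_ll = log_likelihood(proposal, data)  # O(n) at every iteration
        evaluations += len(data)
        # Symmetric proposal and flat prior: the log acceptance ratio is the
        # log-likelihood difference. 1 - random() lies in (0, 1], so log is safe.
        if math.log(1.0 - rng.random()) < proposal_ll - current_ll:
            theta, current_ll = proposal, proposal_ll
        samples.append(theta)
    return samples, evaluations


# Usage on synthetic data (assumed true mean 1.0, n = 500).
data_rng = random.Random(1)
data = [data_rng.gauss(1.0, 1.0) for _ in range(500)]
samples, evaluations = metropolis_hastings(data)
kept = samples[200:]  # discard an assumed burn-in of 200 iterations
posterior_mean = sum(kept) / len(kept)
```

In this sketch the evaluation count grows as n times the number of iterations, which is the per-iteration O(n) cost that divide-and-conquer and subsampling-based methods aim to reduce.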
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Statistical Methods and Inference · Bayesian Methods and Mixture Models
