The Fundamental Incompatibility of Hamiltonian Monte Carlo and Data Subsampling

M. J. Betancourt

arXiv:1502.01510·stat.ME·February 6, 2015·20 cites

TL;DR

Hamiltonian Monte Carlo explores complex, high-dimensional distributions efficiently, but data subsampling compromises that exploration, so subsampled variants of the algorithm do not retain its scalable performance in data-intensive applications.

Contribution

This paper demonstrates that data subsampling fundamentally compromises the efficient exploration of Hamiltonian flow, and with it the scalable performance of Hamiltonian Monte Carlo in large-data settings.

Findings

01. Data subsampling compromises Hamiltonian flow exploration.

02. Hamiltonian Monte Carlo's efficiency is incompatible with data subsampling.

03. Subsampling prevents scalable application of Hamiltonian Monte Carlo.

Abstract

Leveraging the coherent exploration of Hamiltonian flow, Hamiltonian Monte Carlo produces computationally efficient Monte Carlo estimators, even with respect to complex and high-dimensional target distributions. When confronted with data-intensive applications, however, the algorithm may be too expensive to implement, leaving us to consider the utility of approximations such as data subsampling. In this paper I demonstrate how data subsampling fundamentally compromises the efficient exploration of Hamiltonian flow and hence the scalable performance of Hamiltonian Monte Carlo itself.
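
The effect described in the abstract can be illustrated with a small numerical sketch. This is not the paper's experiment: the one-dimensional Gaussian-mean model, the unit mass, the step size, the batch size, and all function names below are assumptions chosen for illustration. The sketch runs a leapfrog integrator twice, once with the full-data gradient and once with a gradient estimated from a fresh random subsample at every evaluation, and records how far the full-data Hamiltonian drifts from its initial value along each trajectory.

```python
import numpy as np


def leapfrog_max_energy_error(grad_fn, potential, q0, p0, step_size, n_steps):
    """Integrate with leapfrog (unit mass) using grad_fn for the momentum updates.

    Returns the largest |H - H0| seen along the trajectory, where H is always
    evaluated with the full-data potential.
    """
    q, p = q0, p0
    h0 = potential(q) + 0.5 * p ** 2
    max_err = 0.0
    for _ in range(n_steps):
        p = p - 0.5 * step_size * grad_fn(q)  # half step in momentum
        q = q + step_size * p                 # full step in position
        p = p - 0.5 * step_size * grad_fn(q)  # half step in momentum
        max_err = max(max_err, abs(potential(q) + 0.5 * p ** 2 - h0))
    return max_err


# Hypothetical toy problem: infer the mean of unit-variance Gaussian data
# under a flat prior, so the potential is the negative log likelihood.
rng = np.random.default_rng(0)
n_data, batch_size = 100, 10
data = rng.normal(loc=1.0, scale=1.0, size=n_data)


def potential(q):
    return 0.5 * np.sum((data - q) ** 2)


def full_grad(q):
    return np.sum(q - data)


def subsampled_grad(q):
    # Unbiased gradient estimate from a fresh subsample, rescaled to full size.
    batch = rng.choice(data, size=batch_size, replace=False)
    return (n_data / batch_size) * np.sum(q - batch)


q0, p0 = data.mean() + 0.3, 1.0
full_err = leapfrog_max_energy_error(full_grad, potential, q0, p0, 0.01, 100)
sub_err = leapfrog_max_energy_error(subsampled_grad, potential, q0, p0, 0.01, 100)
print("max energy error, full gradient:      ", full_err)
print("max energy error, subsampled gradient:", sub_err)
```

In this toy setting the full-gradient trajectory keeps the energy error small and bounded, while the subsampled trajectory accumulates gradient noise and wanders away from the energy level set it started on. In a Metropolis-corrected sampler a large energy error of this kind translates into a low acceptance probability, which is one way to see why subsampling undercuts the efficient exploration the abstract refers to.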

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Statistical Methods and Inference · Stochastic processes and statistical mechanics