Stochastic Gradient Hamiltonian Monte Carlo
Tianqi Chen, Emily B. Fox, Carlos Guestrin

TL;DR
This paper introduces a stochastic gradient variant of Hamiltonian Monte Carlo that effectively handles noisy gradient estimates in large-scale or streaming data scenarios, maintaining accurate sampling of the target distribution.
Contribution
It proposes a stochastic gradient HMC method that adds a friction term, following second-order Langevin dynamics, to counteract the noise introduced by stochastic gradient estimates and keep sampling stable.
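As an illustration of the friction idea, the following is a minimal Python sketch, not the authors' code. It assumes a commonly used discretization of second-order Langevin dynamics with an identity mass matrix: a position step, then a momentum step that applies the noisy gradient, a friction term, and injected Gaussian noise. The names `sghmc` and `noisy_grad_std_normal`, the step size, the friction value, and the toy standard normal target are all assumptions chosen for the example.

```python
import numpy as np


def noisy_grad_std_normal(theta, rng, noise_std=1.0):
    """Hypothetical noisy gradient of U(theta) = theta**2 / 2 (standard normal target).

    The added Gaussian noise stands in for minibatch gradient noise.
    """
    return theta + noise_std * rng.normal()


def sghmc(grad_u_noisy, theta0, n_steps, eps=0.05, friction=1.0, b_hat=0.0, rng=None):
    """One-dimensional SGHMC sketch with identity mass matrix (an assumption).

    grad_u_noisy(theta, rng) returns a noisy estimate of dU/dtheta.
    friction is the friction coefficient; b_hat is a user-supplied estimate
    of the gradient-noise contribution (0 means "ignore it").
    """
    if friction < b_hat:
        raise ValueError("friction must be at least b_hat")
    rng = np.random.default_rng() if rng is None else rng
    theta = float(theta0)
    r = rng.normal()  # initial momentum
    samples = np.empty(n_steps)
    # Standard deviation of the noise injected into the momentum each step.
    inject_std = np.sqrt(2.0 * (friction - b_hat) * eps)
    for t in range(n_steps):
        # Position update driven by the momentum.
        theta = theta + eps * r
        # Momentum update: noisy gradient, friction, and injected noise.
        r = (
            r
            - eps * grad_u_noisy(theta, rng)
            - eps * friction * r
            + inject_std * rng.normal()
        )
        samples[t] = theta
    return samples
```

Under these assumptions, calling `sghmc(noisy_grad_std_normal, 0.0, 100_000)` should return samples whose mean and variance are close to 0 and 1. Setting `friction=0.0` removes both the friction and the injected noise, which leaves the gradient noise uncorrected and is a simple way to see why the friction term matters.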
Findings
The proposed method keeps the target distribution invariant despite noisy gradient estimates.
It is validated on simulated data and on neural network classification tasks.
It is also applied effectively to online Bayesian matrix factorization.
Abstract
Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals. The popularity of such methods has grown significantly in recent years. However, a limitation of HMC methods is the required gradient computation for simulation of the Hamiltonian dynamical system; such computation is infeasible in problems involving a large sample size or streaming data. Instead, we must rely on a noisy gradient estimate computed from a subset of the data. In this paper, we explore the properties of such a stochastic gradient HMC approach. Surprisingly, the natural implementation of the stochastic approximation can be arbitrarily bad. To address this problem we introduce a variant that uses second-order Langevin…
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques
