AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
Ruqi Zhang, A. Feder Cooper, Christopher De Sa

TL;DR
AMAGOLD is a stochastic gradient MCMC method that removes the bias of SGHMC using infrequent Metropolis-Hastings corrections, so it converges to the target distribution with a fixed step size and is practical for Bayesian inference tasks.
Contribution
The paper introduces AMAGOLD, a second-order SG-MCMC algorithm that amortizes Metropolis-Hastings corrections, allowing fixed step size convergence and improved efficiency over existing methods.
Findings
AMAGOLD converges to the target distribution with a fixed step size.
Its convergence rate is at most a constant factor slower than that of a full-batch baseline.
Empirical results show effectiveness on synthetic and real-world Bayesian models.
Abstract
Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method for sampling from continuous distributions. It is a faster alternative to HMC: instead of using the whole dataset at each iteration, SGHMC uses only a subsample. This improves performance, but introduces bias that can cause SGHMC to converge to the wrong distribution. One can prevent this using a step size that decays to zero, but such a step size schedule can drastically slow down convergence. To address this tension, we propose a novel second-order SG-MCMC algorithm, AMAGOLD, that infrequently uses Metropolis-Hastings (M-H) corrections to remove bias. The infrequency of corrections amortizes their cost. We prove AMAGOLD converges to the target distribution with a fixed, rather than a diminishing, step size, and that its convergence rate is at most a constant factor slower than a full-batch baseline. We…
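The idea of amortizing the M-H correction can be illustrated with a minimal sketch. This is not the paper's algorithm, whose exact update and acceptance rule are not given in the abstract above; it is a simplified, frictionless variant in which several minibatch-gradient leapfrog steps are followed by a single full-data M-H test. The toy model (unit-variance Gaussian likelihood with a flat prior on the mean), the function names, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def full_potential(theta, data):
    # Negative log posterior up to a constant (assumed toy model:
    # unit-variance Gaussian likelihood, flat prior on the mean).
    return 0.5 * np.sum((data - theta) ** 2)

def minibatch_grad(theta, data, batch_size, rng):
    # Unbiased estimate of dU/dtheta computed from a random subsample.
    idx = rng.choice(data.size, size=batch_size, replace=False)
    return (data.size / batch_size) * np.sum(theta - data[idx])

def amortized_mh_sampler(data, n_outer=2000, inner_steps=5,
                         step_size=0.02, batch_size=25, seed=0):
    # Hypothetical helper: one full-data M-H test per `inner_steps`
    # cheap minibatch steps, so the full-data cost is amortized.
    rng = np.random.default_rng(seed)
    theta = 0.0
    samples = np.empty(n_outer)
    accepted = 0
    for i in range(n_outer):
        r = rng.standard_normal()          # resample momentum
        theta_new, r_new = theta, r
        for _ in range(inner_steps):
            # Symmetric leapfrog step with a fresh minibatch gradient.
            theta_new += 0.5 * step_size * r_new
            r_new -= step_size * minibatch_grad(theta_new, data, batch_size, rng)
            theta_new += 0.5 * step_size * r_new
        # Single M-H test using the full-data energy at both endpoints.
        h_old = full_potential(theta, data) + 0.5 * r ** 2
        h_new = full_potential(theta_new, data) + 0.5 * r_new ** 2
        if np.log(rng.uniform()) < h_old - h_new:
            theta = theta_new
            accepted += 1
        samples[i] = theta
    return samples, accepted / n_outer
```

With `inner_steps=5`, the full dataset is touched only for the two energy evaluations per outer iteration, while the five gradient evaluations each use a subsample. In this sketch the step size stays fixed throughout; the accept/reject test, rather than a decaying schedule, is what keeps the chain targeting the posterior.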
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques
