On the Convergence of Stochastic Gradient MCMC Algorithms with High-Order Integrators
Changyou Chen, Nan Ding, Lawrence Carin

TL;DR
This paper develops a theoretical framework for stochastic gradient MCMC (SG-MCMC) algorithms with high-order integrators. It shows faster convergence and more accurate invariant measures than with 1st-order integrators, and supports the theory with experiments on synthetic and real data.
Contribution
It introduces a new theory for high-order integrators in SG-MCMC, showing they achieve faster convergence and more accurate invariant measures than first-order methods.
Findings
Higher-order integrators lead to faster convergence rates.
The proposed 2nd-order symmetric splitting integrator lowers the mean square error (MSE) relative to a 1st-order integrator.
Experiments confirm theoretical advantages in large-scale applications.
Abstract
Recent advances in Bayesian learning with large-scale data have witnessed the emergence of stochastic gradient MCMC algorithms (SG-MCMC), such as stochastic gradient Langevin dynamics (SGLD), stochastic gradient Hamiltonian MCMC (SGHMC), and the stochastic gradient thermostat. While finite-time convergence properties of the SGLD with a 1st-order Euler integrator have recently been studied, corresponding theory for general SG-MCMCs has not been explored. In this paper we consider general SG-MCMCs with high-order integrators, and develop theory to analyze finite-time convergence properties and their asymptotic invariant measures. Our theoretical results show faster convergence rates and more accurate invariant measures for SG-MCMCs with higher-order integrators. For example, with the proposed efficient 2nd-order symmetric splitting integrator, the mean square error (MSE) of the…
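To make the contrast between a 1st-order Euler integrator and a symmetric splitting integrator concrete, the following is a minimal sketch for SGHMC-style dynamics on a toy problem. Everything specific here is an assumption for illustration and is not taken from the paper: the 1-D target U(theta) = theta^2 / 2, the Gaussian noise that stands in for minibatch gradient noise, the step size `h`, the friction `D`, and the particular half-step/full-step composition used in `splitting_step`, which is one common way to build a symmetric splitting scheme and may differ in detail from the authors' integrator.

```python
import math
import random


def grad_u_noisy(theta, rng, noise_std=1.0):
    # Gradient of the toy potential U(theta) = theta^2 / 2, plus zero-mean
    # Gaussian noise as a stand-in for minibatch gradient noise (assumption).
    return theta + noise_std * rng.gauss(0.0, 1.0)


def euler_step(theta, p, h, D, rng):
    # 1st-order Euler update: both variables advance from the old state.
    g = grad_u_noisy(theta, rng)
    theta_new = theta + p * h
    p_new = p - D * p * h - g * h + math.sqrt(2.0 * D * h) * rng.gauss(0.0, 1.0)
    return theta_new, p_new


def splitting_step(theta, p, h, D, rng):
    # Symmetric composition: half position move, half friction decay,
    # full noisy-gradient kick, then the two half steps in reverse order.
    theta = theta + p * h / 2.0
    p = math.exp(-D * h / 2.0) * p
    g = grad_u_noisy(theta, rng)
    p = p - g * h + math.sqrt(2.0 * D * h) * rng.gauss(0.0, 1.0)
    p = math.exp(-D * h / 2.0) * p
    theta = theta + p * h / 2.0
    return theta, p


def posterior_average(step, phi, n_steps=50000, h=0.1, D=1.0, seed=0):
    # Running average of a test function phi along one chain.
    rng = random.Random(seed)
    theta, p = 0.0, 0.0
    total = 0.0
    for _ in range(n_steps):
        theta, p = step(theta, p, h, D, rng)
        total += phi(theta)
    return total / n_steps


if __name__ == "__main__":
    second_moment = lambda t: t * t
    print("Euler    :", posterior_average(euler_step, second_moment))
    print("Splitting:", posterior_average(splitting_step, second_moment))
```

Under the noise-free target, a standard Gaussian, the exact value of the second moment is 1, so the two printed estimates can be compared against that reference. Varying `h` in this toy setting is a simple way to see how the discretization bias of each scheme changes with step size.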
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks
