Stochastic Gradient MCMC with Stale Gradients
Changyou Chen, Nan Ding, Chunyuan Li, Yizhe Zhang, and Lawrence Carin

TL;DR
This paper analyzes the impact of stale gradients on stochastic gradient MCMC algorithms, showing that the bias and MSE depend on gradient staleness while the estimation variance does not, which supports scalable distributed Bayesian inference.
Contribution
The paper provides a theoretical analysis of SG-MCMC with stale gradients, characterizing their effects on bias, MSE, and variance, and indicates a linear speedup in variance reduction as the number of workers grows in distributed settings.
Findings
Bias and MSE depend on gradient staleness
Estimation variance is independent of staleness
Linear speedup in variance reduction with more workers
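The three quantities in these findings are tied together by the standard bias-variance decomposition. The notation below is an assumption made for illustration and is not taken from the paper: \hat{\phi}_L denotes the average of a test function \phi over L SG-MCMC samples, and \bar{\phi} denotes its true posterior expectation.

```latex
\underbrace{\mathbb{E}\big[(\hat{\phi}_L - \bar{\phi})^2\big]}_{\text{MSE}}
  = \underbrace{\big(\mathbb{E}[\hat{\phi}_L] - \bar{\phi}\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{\phi}_L - \mathbb{E}[\hat{\phi}_L])^2\big]}_{\text{estimation variance}}
```

Read this way, a staleness-independent variance term means that any dependence of the MSE on staleness enters through the bias term.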
Abstract
Stochastic gradient MCMC (SG-MCMC) has played an important role in large-scale Bayesian learning, with well-developed theoretical convergence properties. In such applications of SG-MCMC, it is becoming increasingly popular to employ distributed systems, where stochastic gradients are computed based on some outdated parameters, yielding what are termed stale gradients. While stale gradients could be directly used in SG-MCMC, their impact on convergence properties has not been well studied. In this paper we develop theory to show that while the bias and MSE of an SG-MCMC algorithm depend on the staleness of stochastic gradients, its estimation variance (relative to the expected estimate, based on a prescribed number of samples) is independent of it. In a simple Bayesian distributed system with SG-MCMC, where stale gradients are computed asynchronously by a set of workers, our theory…
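To make the notion of a stale gradient concrete, here is a minimal single-process sketch, assuming stochastic gradient Langevin dynamics (SGLD) as the SG-MCMC algorithm and a toy Gaussian-mean model. The function name `sgld_stale`, the fixed `staleness` delay, and the model itself are illustrative assumptions, not the paper's experimental setup; a real distributed system would have workers producing gradients asynchronously with varying delays.

```python
from collections import deque

import numpy as np


def sgld_stale(data, n_steps, step_size, staleness,
               batch_size=10, prior_var=100.0, seed=0):
    """SGLD for the mean of a unit-variance Gaussian with a N(0, prior_var) prior.

    Hypothetical illustration: the stochastic gradient at each step is
    evaluated at the parameter value from `staleness` iterations ago,
    simulating a fixed delay (an assumption; real delays vary).
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    theta = 0.0
    # Keep only the last `staleness + 1` iterates; history[0] is the stalest.
    history = deque([theta], maxlen=staleness + 1)
    samples = np.empty(n_steps)
    for step in range(n_steps):
        stale_theta = history[0]
        batch = rng.choice(data, size=batch_size, replace=False)
        # Minibatch estimate of the log-posterior gradient at the stale parameter.
        grad = (-stale_theta / prior_var
                + (n / batch_size) * np.sum(batch - stale_theta))
        # Langevin update: gradient step plus injected Gaussian noise.
        theta = (theta + step_size * grad
                 + np.sqrt(2.0 * step_size) * rng.standard_normal())
        history.append(theta)
        samples[step] = theta
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    data = rng.normal(1.0, 1.0, size=100)
    # Closed-form posterior mean for this conjugate toy model.
    exact = data.sum() / (len(data) + 1.0 / 100.0)
    for tau in (0, 2, 4):
        est = sgld_stale(data, 20000, 1e-3, tau)[1000:].mean()
        print(f"staleness={tau}: estimate={est:.4f}, exact={exact:.4f}")
```

Setting `staleness=0` recovers ordinary SGLD. The step size here is deliberately small relative to the delay, since in this sketch a large step combined with a long delay can make the delayed recursion unstable.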
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference
