Robust Stochastic Gradient Posterior Sampling with Lattice Based Discretisation
Zier Mensch, Lars Holdijk, Samuel Duffield, Maxwell Aifer, Patrick J. Coles, Max Welling, Miranda C. N. Cheng

TL;DR
This paper introduces SGLRW, a stochastic gradient MCMC method that is more robust to minibatch size and gradient noise than SGLD and remains stable in regimes where SGLD fails, including under heavy-tailed gradient noise.
Contribution
The paper proposes SGLRW, a novel lattice-based stochastic gradient MCMC method that enhances robustness to gradient noise and minibatch size effects, with theoretical and empirical validation.
Findings
SGLRW is more stable than SGLD under heavy-tailed gradient noise.
SGLRW maintains asymptotic correctness despite robustness modifications.
Experiments on Bayesian regression and classification show that SGLRW matches or improves on the predictive performance of SGLD.
Abstract
Stochastic-gradient MCMC methods enable scalable Bayesian posterior sampling but often suffer from sensitivity to minibatch size and gradient noise. To address this, we propose Stochastic Gradient Lattice Random Walk (SGLRW), an extension of the Lattice Random Walk discretization. Unlike conventional Stochastic Gradient Langevin Dynamics (SGLD), SGLRW introduces stochastic noise only through the off-diagonal elements of the update covariance; this yields greater robustness to minibatch size while retaining asymptotic correctness. As a point of comparison, we also analyze a natural analogue of SGLD that uses gradient clipping. Experimental validation on Bayesian regression and classification demonstrates that SGLRW remains stable in regimes where SGLD fails, including in the presence of heavy-tailed gradient noise, and matches or improves predictive performance.
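To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. `sgld_step` is the standard SGLD update with an optional norm-clipping variant, and `lattice_step` is a generic binary lattice random walk in which each coordinate moves by a fixed spacing and the gradient only biases the direction of the move. All function names, the `clip_norm` parameter, and the choice of lattice spacing `sqrt(step_size)` are assumptions made for illustration; the paper's SGLRW update may differ in its details.

```python
import numpy as np

def sgld_step(theta, grad_log_post, step_size, rng, clip_norm=None):
    """One standard SGLD update; optionally clips the gradient norm first.

    grad_log_post is a (minibatch) estimate of the gradient of the log posterior.
    clip_norm is a hypothetical parameter used only for this illustration.
    """
    g = np.asarray(grad_log_post, dtype=float)
    if clip_norm is not None:
        norm = np.linalg.norm(g)
        if norm > clip_norm:
            g = g * (clip_norm / norm)  # rescale so that ||g|| == clip_norm
    noise = rng.normal(size=theta.shape)
    # Drift of (step_size / 2) * gradient plus Gaussian noise of variance step_size.
    return theta + 0.5 * step_size * g + np.sqrt(step_size) * noise

def lattice_step(theta, grad_log_post, step_size, rng):
    """One generic lattice random walk update (illustrative assumption).

    Every coordinate moves by exactly +delta or -delta. The gradient only tilts
    the probability of moving up, so a noisy gradient cannot produce a large jump.
    """
    g = np.asarray(grad_log_post, dtype=float)
    delta = np.sqrt(step_size)  # assumed lattice spacing, matches unit diffusion
    # Choose the bias so the expected move equals the SGLD drift when unclipped.
    bias = np.clip(0.5 * step_size * g / delta, -1.0, 1.0)
    p_up = 0.5 * (1.0 + bias)
    signs = np.where(rng.random(theta.shape) < p_up, 1.0, -1.0)
    return theta + delta * signs

# Toy usage: standard normal target, so the exact gradient is -theta.
rng = np.random.default_rng(0)
theta_sgld = np.zeros(2)
theta_lattice = np.zeros(2)
for _ in range(1000):
    theta_sgld = sgld_step(theta_sgld, -theta_sgld, 0.01, rng, clip_norm=10.0)
    theta_lattice = lattice_step(theta_lattice, -theta_lattice, 0.01, rng)
```

In this sketch the per-step displacement of `lattice_step` is bounded by the lattice spacing regardless of how large the gradient estimate is, which gives some intuition for why a lattice discretization can tolerate heavy-tailed gradient noise; the clipped `sgld_step` bounds only the drift term.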
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference
