Structured Stochastic Gradient MCMC
Antonios Alexos, Alex Boyd, Stephan Mandt

TL;DR
This paper introduces a non-parametric variational inference method combined with a Langevin-type algorithm that improves mixing speed and scalability for Bayesian neural networks, outperforming traditional SG-MCMC and VI methods.
Contribution
It proposes a novel non-parametric variational approximation and a Langevin-type algorithm that allows flexible dependency modeling and faster convergence in Bayesian inference.
Findings
Improved convergence speed over SG-MCMC and VI.
Achieved higher accuracy on ResNet-20 benchmarks.
Enhanced scalability with dropout modifications.
Abstract
Stochastic gradient Markov Chain Monte Carlo (SGMCMC) is considered the gold standard for Bayesian inference in large-scale models, such as Bayesian neural networks. Since practitioners face speed versus accuracy tradeoffs in these models, variational inference (VI) is often the preferable option. Unfortunately, VI makes strong assumptions on both the factorization and functional form of the posterior. In this work, we propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form and allows practitioners to specify the exact dependencies the algorithm should respect or break. The approach relies on a new Langevin-type algorithm that operates on a modified energy function, where parts of the latent variables are averaged over samples from earlier iterations of the Markov chain. This way, statistical dependencies can be…
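The mechanism described in the abstract, a Langevin-type update on an energy in which part of the latent variables is averaged over earlier samples of the chain, can be illustrated with a small toy sketch. This is not the authors' implementation. The 2-D Gaussian target, the function names (`grad_energy`, `langevin`, `structured_langevin`), and the `n_avg` parameter are all assumptions made for illustration, and exact gradients are used instead of stochastic minibatch gradients.

```python
import numpy as np

# Toy target (an assumption for illustration): zero-mean 2-D Gaussian
# whose two coordinates are strongly correlated.
COV = np.array([[1.0, 0.8], [0.8, 1.0]])
PRECISION = np.linalg.inv(COV)


def grad_energy(theta):
    """Gradient of the energy U(theta) = 0.5 * theta^T P theta."""
    return PRECISION @ theta


def langevin(n_steps, step, rng):
    """Plain (unadjusted) Langevin dynamics on the original energy."""
    theta = np.zeros(2)
    samples = np.empty((n_steps, 2))
    for t in range(n_steps):
        noise = rng.normal(size=2)
        theta = theta - step * grad_energy(theta) + np.sqrt(2.0 * step) * noise
        samples[t] = theta
    return samples


def structured_langevin(n_steps, step, rng, n_avg=8):
    """Langevin dynamics on a modified energy (hypothetical sketch).

    When updating coordinate i, the other coordinate is replaced by values
    drawn from earlier iterations of the same chain, and the gradient is
    averaged over those draws. Here both dependencies are broken.
    """
    theta = np.zeros(2)
    samples = np.empty((n_steps, 2))
    for t in range(n_steps):
        grad = np.empty(2)
        for i in range(2):
            j = 1 - i
            # Past values of the other coordinate (current value at t = 0).
            history = samples[:t, j] if t > 0 else theta[j:j + 1]
            draws = rng.choice(history, size=n_avg)
            total = 0.0
            for value in draws:
                mixed = theta.copy()
                mixed[j] = value  # swap in a past sample for coordinate j
                total += grad_energy(mixed)[i]
            grad[i] = total / n_avg
        noise = rng.normal(size=2)
        theta = theta - step * grad + np.sqrt(2.0 * step) * noise
        samples[t] = theta
    return samples


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    plain = langevin(20000, 0.05, rng)
    struct = structured_langevin(5000, 0.05, rng)
    print("plain correlation:", np.corrcoef(plain[1000:].T)[0, 1])
    print("structured correlation:", np.corrcoef(struct[1000:].T)[0, 1])
```

In this toy setting the plain sampler should reproduce the strong correlation of the target, while the structured sampler should yield nearly uncorrelated coordinates, since the dependency between them has been deliberately broken. No Gaussian form is imposed on the samples themselves; the target here merely happens to be Gaussian.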
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis
Methods: Variational Inference
