Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring
Sungjin Ahn (UC Irvine), Anoop Korattikara (UC Irvine), Max Welling (UC Irvine)

TL;DR
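This paper introduces a Bayesian sampling algorithm that approximates the posterior distribution using stochastic gradients computed on small mini-batches. It uses the Bayesian Central Limit Theorem to sample from a normal approximation of the posterior at high mixing rates, and it behaves like SGLD with a pre-conditioner matrix at slow mixing rates.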
Contribution
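It extends the stochastic gradient Langevin dynamics (SGLD) algorithm using the Bayesian Central Limit Theorem: at high mixing rates the sampler draws from a normal approximation of the posterior, and at slow mixing rates it mimics SGLD with a pre-conditioner matrix.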
Findings
The proposed method mixes faster than the original SGLD algorithm.
It approximates the Bayesian posterior while touching only a small mini-batch of data items per generated sample.
The algorithm is reminiscent of Fisher scoring with stochastic gradients, which makes it an efficient optimizer during burn-in.
Abstract
In this paper we address the following question: Can we approximately sample from a Bayesian posterior distribution if we are only allowed to touch a small mini-batch of data-items for every sample we generate? An algorithm based on the Langevin equation with stochastic gradients (SGLD) was previously proposed to solve this, but its mixing rate was slow. By leveraging the Bayesian Central Limit Theorem, we extend the SGLD algorithm so that at high mixing rates it will sample from a normal approximation of the posterior, while for slow mixing rates it will mimic the behavior of SGLD with a pre-conditioner matrix. As a bonus, the proposed algorithm is reminiscent of Fisher scoring (with stochastic gradients) and as such is an efficient optimizer during burn-in.
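To make the baseline concrete, the following is a minimal sketch of plain SGLD, the earlier algorithm the abstract builds on, and not the Fisher scoring sampler proposed in the paper. It assumes a toy model that does not appear in the source: unit-variance Gaussian data with an unknown mean and a zero-mean Gaussian prior. The function name `sgld_gaussian_mean` and all of its parameters are hypothetical choices made for illustration.

```python
import numpy as np


def sgld_gaussian_mean(x, n_steps=5000, batch_size=32, step_size=1e-4,
                       prior_var=10.0, seed=0):
    """Plain SGLD for the mean of a unit-variance Gaussian (toy assumption).

    Each step touches only `batch_size` data items, rescales the mini-batch
    gradient to the full data set, and injects Gaussian noise whose variance
    equals the step size.
    """
    rng = np.random.default_rng(seed)
    n_data = len(x)
    theta = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        # Draw a mini-batch without replacement.
        batch = rng.choice(x, size=batch_size, replace=False)
        # Gradient of the log prior N(0, prior_var) with respect to theta.
        grad_prior = -theta / prior_var
        # Mini-batch log-likelihood gradient, rescaled by n_data / batch_size.
        grad_lik = (n_data / batch_size) * np.sum(batch - theta)
        # Langevin update: half a step along the gradient plus injected noise.
        noise = rng.normal(0.0, np.sqrt(step_size))
        theta = theta + 0.5 * step_size * (grad_prior + grad_lik) + noise
        samples[t] = theta
    return samples
```

Calling `sgld_gaussian_mean(x)` on a one-dimensional NumPy array `x` returns the chain of sampled means; discarding an initial stretch as burn-in and averaging the rest should land near the data mean. The step size is held fixed here for simplicity, so the chain only approximates the posterior.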
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models
