Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

Qiang Liu; Dilin Wang

arXiv:1608.04471·stat.ML·September 10, 2019·325 cites

TL;DR

This paper introduces Stein Variational Gradient Descent (SVGD), a general-purpose Bayesian inference algorithm that iteratively transports a set of particles toward the target distribution by functional gradient descent on the KL divergence. The authors report performance competitive with existing state-of-the-art methods on real-world models and datasets.

Contribution

The paper presents a variational inference method based on functional gradient descent. Its derivation rests on a new result connecting the derivative of the KL divergence under smooth transforms with Stein's identity and the kernelized Stein discrepancy.
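The connection between the KL derivative and Stein's identity can be written out as follows. This is a sketch in the notation commonly used for SVGD, not a quotation from this page: the perturbation direction \phi, step size \epsilon, kernel k, and operator name \mathcal{A}_p are symbols introduced here for illustration.

```latex
% Perturb particles x ~ q by a small smooth map; q_{[T]} is the density of T(x).
T(x) = x + \epsilon\, \phi(x)

% Derivative of the KL divergence along the perturbation:
\nabla_{\epsilon}\, \mathrm{KL}\!\left(q_{[T]} \,\|\, p\right)\Big|_{\epsilon = 0}
  = -\,\mathbb{E}_{x \sim q}\!\left[\operatorname{trace}\!\left(\mathcal{A}_p \phi(x)\right)\right],
\qquad
\mathcal{A}_p \phi(x) = \nabla_x \log p(x)\, \phi(x)^{\top} + \nabla_x \phi(x)

% Stein's identity: the same expectation vanishes when taken under p itself.
\mathbb{E}_{x \sim p}\!\left[\mathcal{A}_p \phi(x)\right] = 0

% Restricting \phi to the unit ball of an RKHS with kernel k, the steepest
% descent direction is proportional to
\phi^{*}_{q,p}(\cdot)
  = \mathbb{E}_{x \sim q}\!\left[\, k(x, \cdot)\, \nabla_x \log p(x) + \nabla_x k(x, \cdot) \,\right]
```

The maximal rate of decrease over that unit ball is the kernelized Stein discrepancy between q and p, which is zero when q = p under suitable conditions on the kernel.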

Findings

1. Competitive performance on real-world datasets
2. Theoretical connection between KL derivatives and Stein's identity
3. Applicable to various Bayesian models

Abstract

We propose a general purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization. Our method iteratively transports a set of particles to match the target distribution, by applying a form of functional gradient descent that minimizes the KL divergence. Empirical studies are performed on various real world models and datasets, on which our method is competitive with existing state-of-the-art methods. The derivation of our method is based on a new theoretical result that connects the derivative of KL divergence under smooth transforms with Stein's identity and a recently proposed kernelized Stein discrepancy, which is of independent interest.
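The particle transport described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' reference implementation: the function names `rbf_kernel` and `svgd`, the RBF kernel, the median-based bandwidth, the fixed step size, and the one-dimensional Gaussian target are all choices made here for the example.

```python
import numpy as np


def rbf_kernel(x, h=None):
    """RBF kernel matrix and summed kernel gradients for particles x of shape (n, d)."""
    diff = x[:, None, :] - x[None, :, :]      # diff[j, i] = x_j - x_i, shape (n, n, d)
    sq_dist = np.sum(diff ** 2, axis=-1)      # squared pairwise distances, shape (n, n)
    if h is None:
        # Assumption: a median-based bandwidth heuristic; other choices are possible.
        h = np.median(sq_dist) / np.log(x.shape[0] + 1.0) + 1e-8
    k = np.exp(-sq_dist / h)                  # k[j, i] = k(x_j, x_i)
    # Gradient of k(x_j, x_i) with respect to x_j, summed over j.
    # This term pushes particle i away from its neighbours (repulsion).
    grad_k = -2.0 / h * np.sum(diff * k[:, :, None], axis=0)   # shape (n, d)
    return k, grad_k


def svgd(x0, score_fn, n_iter=1000, step_size=0.1):
    """Move particles along the kernelized update direction.

    score_fn(x) must return the gradient of log p at each particle, shape (n, d).
    A fixed step size is used here for simplicity.
    """
    x = np.array(x0, dtype=float)
    n = x.shape[0]
    for _ in range(n_iter):
        k, grad_k = rbf_kernel(x)
        # Kernel-weighted average of scores (attraction) plus repulsion term.
        phi = (k @ score_fn(x) + grad_k) / n
        x = x + step_size * phi
    return x


# Hypothetical target for illustration: a 1-D Gaussian with mean 2 and unit variance.
def gaussian_score(x):
    return -(x - 2.0)


rng = np.random.default_rng(0)
initial = rng.normal(loc=0.0, scale=1.0, size=(50, 1))
particles = svgd(initial, gaussian_score)
print(particles.mean(), particles.std())
```

With this setup the particle mean should drift from 0 toward the target mean of 2, while the repulsion term keeps the particles from collapsing onto the mode. Only the score (the gradient of the log density) is needed, so the target's normalizing constant never has to be computed.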

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis