Stein Variational Gradient Descent as Moment Matching
Qiang Liu, Dilin Wang

TL;DR
This paper analyzes Stein Variational Gradient Descent (SVGD) theoretically, showing that its fixed points exactly match a kernel-dependent set of moments, so the choice of kernel determines which quantities SVGD estimates accurately. The analysis has implications for designing more efficient inference algorithms.
Contribution
The paper introduces the Stein matching set (the functions whose expectations are exactly estimated at any SVGD fixed point), analyzes the non-asymptotic properties of SVGD, and examines how different kernels influence its inference accuracy.
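As a rough sketch of the idea, the fixed-point condition and the resulting matching property can be written as follows. The notation is an assumption made for illustration and may differ from the paper's: particles $x_1,\dots,x_n$, target density $p$, a kernel with a finite feature expansion $k(x,x') = \sum_l \phi_l(x)\phi_l(x')$, and a feature matrix $[\phi_l(x_i)]_{i,l}$ of full column rank.

```latex
% SVGD fixed point: the update direction vanishes at every particle x_i
\sum_{j=1}^{n} \Big[ k(x_j, x_i)\, \nabla_{x_j} \log p(x_j) + \nabla_{x_j} k(x_j, x_i) \Big] = 0,
\qquad i = 1, \dots, n.

% Stein operator applied to a scalar feature \phi_l
\mathcal{P}\phi_l(x) = \nabla_x \log p(x)\, \phi_l(x) + \nabla_x \phi_l(x).

% Substituting k(x, x') = \sum_l \phi_l(x)\phi_l(x') into the fixed point gives
\sum_{l} \phi_l(x_i) \Big[ \frac{1}{n} \sum_{j=1}^{n} \mathcal{P}\phi_l(x_j) \Big] = 0,
\qquad i = 1, \dots, n.

% With a full-column-rank feature matrix, each bracket must vanish, and
% Stein's identity gives E_p[P phi_l] = 0, so the particles match p exactly
% on every function of the form P phi_l:
\frac{1}{n} \sum_{j=1}^{n} \mathcal{P}\phi_l(x_j) = 0 = \mathbb{E}_{x \sim p}\big[ \mathcal{P}\phi_l(x) \big].
```

Under these assumptions, the components of $\mathcal{P}\phi_l$ are functions whose expectations the particles reproduce exactly, which is the sense in which the matching set is the image of the Stein operator on the kernel's feature maps.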
Findings
SVGD with linear kernels exactly estimates means and variances for Gaussian distributions.
Random Fourier features enable probabilistic bounds for distributional approximation.
The theoretical framework connects the properties of SVGD to the choice of kernel and to Stein's identity.
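The first finding can be checked numerically with a minimal sketch. It is not the authors' code: the kernel form k(x, x') = xᵀx' + 1, the target parameters, the step size, the particle count, and the function name are all assumptions chosen for illustration. At convergence, the particle mean and the 1/n-normalized particle covariance should agree with the target Gaussian.

```python
import numpy as np

def svgd_linear_kernel_gaussian(mu, Sigma, n_particles=10, step=0.05,
                                n_iters=5000, seed=0):
    """Run SVGD on N(mu, Sigma) with the (assumed) kernel k(x, x') = x.x' + 1."""
    rng = np.random.default_rng(seed)
    d = mu.shape[0]
    X = rng.normal(size=(n_particles, d))   # initial particles
    P = np.linalg.inv(Sigma)                # precision matrix (symmetric)
    for _ in range(n_iters):
        S = -(X - mu) @ P                   # score grad log p(x_j), one row per particle
        K = X @ X.T + 1.0                   # kernel matrix K[i, j] = k(x_i, x_j)
        # SVGD direction at x_i: (1/n) sum_j [k(x_j, x_i) s_j + grad_{x_j} k(x_j, x_i)].
        # For this kernel grad_{x_j} k(x_j, x_i) = x_i, so the second term averages to X.
        phi = K @ S / n_particles + X
        X = X + step * phi
    return X

# Illustrative 2-D target (values are arbitrary assumptions).
mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 2.0]])

particles = svgd_linear_kernel_gaussian(mu, Sigma)
print("particle mean:", particles.mean(axis=0))
print("particle covariance (1/n):", np.cov(particles.T, bias=True))
```

The covariance is computed with `bias=True` because the fixed-point argument concerns the plain empirical average over particles. The match also relies on there being at least d + 1 particles in general position, so that the features (1, x) span the affine functions.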
Abstract
Stein variational gradient descent (SVGD) is a non-parametric inference algorithm that evolves a set of particles to fit a given distribution of interest. We analyze the non-asymptotic properties of SVGD, showing that there exists a set of functions, which we call the Stein matching set, whose expectations are exactly estimated by any set of particles that satisfies the fixed point equation of SVGD. This set is the image of Stein operator applied on the feature maps of the positive definite kernel used in SVGD. Our results provide a theoretical framework for analyzing the properties of SVGD with different kernels, shedding insight into optimal kernel choice. In particular, we show that SVGD with linear kernels yields exact estimation of means and variances on Gaussian distributions, while random Fourier features enable probabilistic bounds for distributional approximation. Our results…
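For readers unfamiliar with random Fourier features, the sketch below shows the standard construction for a Gaussian RBF kernel: a finite random feature map whose inner products approximate the kernel. The paper's exact feature construction may differ, and the function name, bandwidth, and feature count here are assumptions made for illustration.

```python
import numpy as np

def random_fourier_features(X, n_features, bandwidth, rng):
    """Map X (n x d) to features whose inner products approximate an RBF kernel."""
    d = X.shape[1]
    # Frequencies drawn from the RBF kernel's spectral density, N(0, I / bandwidth^2).
    W = rng.normal(scale=1.0 / bandwidth, size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
bandwidth = 1.0

Z = random_fourier_features(X, n_features=20000, bandwidth=bandwidth, rng=rng)
K_approx = Z @ Z.T                          # finite-feature approximation of the kernel

# Exact RBF kernel exp(-||x - y||^2 / (2 * bandwidth^2)) for comparison.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K_exact = np.exp(-sq_dists / (2.0 * bandwidth ** 2))

max_abs_error = np.abs(K_approx - K_exact).max()
print("max abs kernel approximation error:", max_abs_error)
```

Because the resulting kernel has an explicit finite feature map, the fixed-point argument about feature maps applies to it directly. The approximation error shrinks roughly as one over the square root of the number of features.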
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Stochastic Gradient Optimization Techniques · Statistical Mechanics and Entropy
