Improved Stein Variational Gradient Descent with Importance Weights

Lukang Sun; Peter Richtárik

arXiv:2210.00462·cs.LG·November 22, 2022

TL;DR

This paper introduces $β$-SVGD, a variant of Stein Variational Gradient Descent (SVGD) that incorporates importance weights. It converges faster than standard SVGD in experiments, and its convergence time depends only weakly on the initial distribution.

Contribution

The authors propose $β$-SVGD, an enhancement of SVGD that uses importance weights, supported by a descent lemma and by experiments in which it outperforms standard SVGD.

Findings

01

$β$-SVGD converges faster than SVGD in experiments.

02

In the continuous-time, infinite-particle regime, the convergence time depends only weakly on the initial distribution $ρ_{0}$ and the target $π$.

03

A descent lemma is established for $β$-SVGD under certain assumptions.

Abstract

Stein Variational Gradient Descent (SVGD) is a popular sampling algorithm used in various machine learning tasks. It is well known that SVGD arises from a discretization of the kernelized gradient flow of the Kullback-Leibler divergence $D_{KL}(\cdot \mid π)$, where $π$ is the target distribution. In this work, we propose to enhance SVGD via the introduction of importance weights, which leads to a new method for which we coin the name $β$-SVGD. In the continuous time and infinite particles regime, the time for this flow to converge to the equilibrium distribution $π$, quantified by the Stein Fisher information, depends on $ρ_{0}$ and $π$ very weakly. This is very different from the kernelized gradient flow of Kullback-Leibler divergence, whose time complexity depends on $D_{KL}(ρ_{0} \mid π)$. Under certain assumptions, we provide a descent lemma for…
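For orientation, the baseline that the abstract builds on can be sketched in a few lines. The block below is a minimal sketch of the standard SVGD particle update with an RBF kernel, not the $β$-SVGD method itself: the importance-weight rule is not given in the text above, so it is left out rather than guessed. The function names (`rbf_kernel`, `svgd_step`), the fixed bandwidth `h`, the step size, and the Gaussian target in the demo are all illustrative assumptions.

```python
import numpy as np


def rbf_kernel(x, h=1.0):
    """RBF kernel matrix and its gradients for particles x of shape (n, d).

    Returns k with k[i, j] = exp(-|x_i - x_j|^2 / (2 h^2)) and grad_k with
    grad_k[i, j] = gradient of k(x_j, x_i) with respect to x_j.
    The bandwidth h is a fixed, assumed value (no median heuristic here).
    """
    diff = x[:, None, :] - x[None, :, :]          # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * h ** 2))
    grad_k = diff * k[:, :, None] / h ** 2        # repulsive term
    return k, grad_k


def svgd_step(x, grad_log_pi, step=0.1, h=1.0):
    """One standard SVGD update (no importance weights).

    grad_log_pi maps an (n, d) array of particles to the (n, d) array of
    scores, i.e. gradients of log pi evaluated at each particle.
    """
    n = x.shape[0]
    k, grad_k = rbf_kernel(x, h)
    score = grad_log_pi(x)
    # phi[i] = (1/n) * sum_j [ k(x_j, x_i) * score_j + grad_{x_j} k(x_j, x_i) ]
    phi = (k @ score + grad_k.sum(axis=1)) / n
    return x + step * phi


if __name__ == "__main__":
    # Assumed toy target: a 1-D Gaussian N(2, 1), whose score is -(x - 2).
    rng = np.random.default_rng(0)
    particles = rng.normal(-3.0, 0.5, size=(50, 1))
    for _ in range(500):
        particles = svgd_step(particles, lambda z: -(z - 2.0), step=0.1)
    print("particle mean:", particles.mean())
```

The first term in `phi` moves particles toward high-density regions of the target, and the second term pushes them apart. An importance-weighted variant would presumably rescale each particle's update, but the exact form should be taken from the paper.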

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques