Long-time asymptotics of noisy SVGD outside the population limit
Victor Priser (S2A, IDS), Pascal Bianchi (S2A, IDS), Adil Salim

TL;DR
This paper analyzes the long-term behavior of a noisy variant of Stein Variational Gradient Descent (SVGD), showing it converges to the target distribution and avoids variance collapse, with trajectories resembling a McKean-Vlasov process.
Contribution
It provides the first theoretical characterization of the asymptotic behavior of noisy SVGD and demonstrates its advantages over standard SVGD in avoiding variance collapse.
Findings
Noisy SVGD's limit set (for a large number of iterations) is well-defined and approaches the target distribution as the number of particles increases.
Noisy SVGD avoids the variance collapse observed in standard SVGD.
Trajectories of noisy SVGD resemble a McKean-Vlasov process.
Abstract
Stein Variational Gradient Descent (SVGD) is a widely used sampling algorithm that has been successfully applied in several areas of Machine Learning. SVGD operates by iteratively moving a set of interacting particles (which represent the samples) to approximate the target distribution. Despite recent studies on the complexity of SVGD and its variants, their long-time asymptotic behavior (i.e., after numerous iterations) is still not understood in the finite number of particles regime. We study the long-time asymptotic behavior of a noisy variant of SVGD. First, we establish that the limit set of noisy SVGD for a large number of iterations is well-defined. We then characterize this limit set, showing that it approaches the target distribution as the number of particles increases. In particular, noisy SVGD provably avoids the variance collapse observed for SVGD. Our approach involves demonstrating that the trajectories of noisy SVGD…
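The particle update described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: it assumes the noisy variant adds a Langevin term (score drift plus Gaussian noise) to the usual SVGD direction, scaled by a coefficient called `noise_coef` here. The RBF kernel, the fixed `bandwidth`, the standard Gaussian target, and all function names are assumptions chosen for the example.

```python
import numpy as np


def rbf_kernel_terms(x, bandwidth):
    """Return the RBF kernel matrix (n, n) and the repulsion term (n, d)."""
    diff = x[:, None, :] - x[None, :, :]  # diff[i, j] = x_i - x_j
    sq_dist = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    # sum_j grad_{x_j} k(x_j, x_i) = sum_j k_ij * (x_i - x_j) / h^2
    repulsion = np.sum(k[:, :, None] * diff, axis=1) / bandwidth ** 2
    return k, repulsion


def noisy_svgd_step(x, score, step, noise_coef, bandwidth, rng):
    """One step: SVGD direction plus an assumed Langevin term.

    x          : (n, d) array of particles
    score      : callable returning grad log pi at each particle, shape (n, d)
    noise_coef : assumed Langevin coefficient; 0 gives plain SVGD
    """
    n = x.shape[0]
    s = score(x)
    k, repulsion = rbf_kernel_terms(x, bandwidth)
    phi = (k @ s + repulsion) / n  # standard SVGD direction
    drift = phi + noise_coef * s   # extra Langevin drift (assumption)
    noise = np.sqrt(2.0 * step * noise_coef) * rng.standard_normal(x.shape)
    return x + step * drift + noise


def run(n_particles=20, dim=2, n_iters=500, step=0.05,
        noise_coef=0.1, bandwidth=1.0, seed=0):
    """Run the sketch on a standard Gaussian target (score(x) = -x)."""
    rng = np.random.default_rng(seed)
    x = 3.0 * rng.standard_normal((n_particles, dim))
    for _ in range(n_iters):
        x = noisy_svgd_step(x, lambda z: -z, step, noise_coef, bandwidth, rng)
    return x


if __name__ == "__main__":
    particles = run()
    # Per-coordinate empirical variance is one simple diagnostic for collapse.
    print(particles.var(axis=0))
```

Setting `noise_coef=0.0` in this sketch removes both the extra drift and the noise, which leaves the deterministic SVGD update. Comparing the per-coordinate variance of the particles across values of `noise_coef` is one simple way to probe the variance-collapse behavior the abstract refers to.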
Peer Reviews
Decision: ICLR 2025 Poster
The theoretical understanding of SVGD has been lacking over the years. In particular, most existing convergence analyses are based on population limits. Instead, this paper attempts to study the asymptotic convergence behavior of finite particles, specifically by letting the number of iterations go to infinity and then letting the number of particles go to infinity. A step in this direction helps deepen the understanding of SVGD.
1. The main weakness is that the noisy version of SVGD deviates a lot from the SVGD used in practice. Presumably, the main reason for analyzing this particular version of noisy SVGD is to resolve certain technical challenges in the proof.
1. The coefficient of the Langevin dynamics controls how similar the noisy SVGD is to vanilla SVGD and Langevin dynamics. Large coefficients reduce to Langevin dynamics, whereas small coefficients reduce to vanilla SVGD. It is not clear whether there is a swee
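The interpolation the reviewer describes can be made concrete under an assumed form of the update. The equations below are a sketch, not quoted from the paper: the step size $\gamma$, the Langevin coefficient $\varepsilon$, the kernel $K$, and the iteration index $k$ are notation introduced here for illustration.

```latex
% Assumed noisy SVGD update for particle i at iteration k (n particles, target \pi):
x_i^{k+1} = x_i^k + \gamma\,\phi_k(x_i^k)
          + \gamma\varepsilon\,\nabla\log\pi(x_i^k)
          + \sqrt{2\gamma\varepsilon}\,\xi_i^k,
\qquad \xi_i^k \sim \mathcal{N}(0, I_d),

% with the usual SVGD direction
\phi_k(x) = \frac{1}{n}\sum_{j=1}^{n}
  \Big[ K(x_j^k, x)\,\nabla\log\pi(x_j^k) + \nabla_{x_j^k} K(x_j^k, x) \Big].

% Small coefficient: \varepsilon \to 0 removes the last two terms (vanilla SVGD):
x_i^{k+1} = x_i^k + \gamma\,\phi_k(x_i^k).

% Large coefficient: hold \gamma' = \gamma\varepsilon fixed and let \varepsilon \to \infty;
% the interaction term \tfrac{\gamma'}{\varepsilon}\,\phi_k vanishes (unadjusted Langevin):
x_i^{k+1} = x_i^k + \gamma'\,\nabla\log\pi(x_i^k) + \sqrt{2\gamma'}\,\xi_i^k
          + \frac{\gamma'}{\varepsilon}\,\phi_k(x_i^k).
```

Under this assumed form, the large-coefficient limit only reads as Langevin dynamics after rescaling the step size, which is one reason the choice of coefficient and step size may need to be considered jointly.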
The paper shows that noisy SVGD can properly sample a target distribution.
The results are not all that surprising, and I'm surprised earlier papers haven't proved this before. The background section may not have been thorough enough.
- The paper resolves a well-known issue of SVGD (variance collapse) by coming up with a modification that provably and empirically does not show variance collapse. The results of the paper are solid both in theory and practice.
- The theory of the paper relies heavily on results on McKean-Vlasov equations, which are not part of the standard toolkit for machine learning researchers. This paper can serve as a guide to finite-particle convergence results for other particle algorithms.
- The paper would be really complete if the authors gave some guidance on how to appropriately set the Langevin regularization parameter, at least in the settings that they have studied. That would be useful for practitioners looking to replace SVGD or the Langevin algorithm by NSVGD.
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Random Matrices and Applications · Machine Learning and Algorithms
Methods: Sparse Evolutionary Training
