Bures-Wasserstein Importance-Weighted Evidence Lower Bound: Exposition and Applications
Peiwen Jiang, Takuo Matsubara, Minh-Ngoc Tran

TL;DR
This paper introduces a geometric approach to optimizing the Importance-Weighted Evidence Lower Bound (IW-ELBO) in Bures-Wasserstein space, which improves gradient stability and efficiency for variational inference with Gaussian distributions.
Contribution
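The paper formulates IW-ELBO optimization in Bures-Wasserstein space, derives the Wasserstein gradient of the IW-ELBO, and proves that the signal-to-noise ratio (SNR) of the resulting gradient estimator scales better than that of the Euclidean estimator, which makes Gaussian variational inference more stable.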
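As background for the geometry named above, the squared 2-Wasserstein distance between two Gaussians has a well-known closed form: W2^2 = ||m1 - m2||^2 + tr(S1 + S2 - 2 (S1^{1/2} S2 S1^{1/2})^{1/2}). The sketch below only evaluates that formula with NumPy. It is not the paper's algorithm, and the function names `psd_sqrt` and `bures_wasserstein_sq` are hypothetical names chosen for illustration.

```python
import numpy as np


def psd_sqrt(a):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def bures_wasserstein_sq(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between N(m1, s1) and N(m2, s2)."""
    root1 = psd_sqrt(s1)
    cross = psd_sqrt(root1 @ s2 @ root1)
    mean_term = np.sum((m1 - m2) ** 2)
    cov_term = np.trace(s1 + s2 - 2.0 * cross)
    return float(mean_term + cov_term)


# Illustrative inputs (assumptions, not taken from the paper).
m1, s1 = np.zeros(2), np.eye(2)
m2, s2 = np.array([3.0, 4.0]), 4.0 * np.eye(2)
dist_sq = bures_wasserstein_sq(m1, s1, m2, s2)
```

For these isotropic inputs the mean term is 25 and the covariance term is 2, so `dist_sq` equals 27 up to floating-point error.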
Findings
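The SNR of the Wasserstein gradient estimator scales as Ω(√K) in the number of importance samples K, whereas the SNR of the standard Euclidean gradient estimator vanishes as K grows.
The proposed method outperforms baselines in approximation quality.
The analysis extends to the Variational Rényi IW autoencoder.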
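The vanishing SNR of the Euclidean estimator can be observed on a toy problem. The sketch below is an illustration under stated assumptions, not the paper's experiment: the target is the unnormalized density exp(-z^2/2), the variational family is N(mu, sigma^2), and the gradient is the reparameterized IW-ELBO gradient with respect to mu. The values mu = 1.0 and sigma = 1.5, the replicate count, and the function names are all hypothetical choices. It does not implement the paper's Wasserstein gradient.

```python
import numpy as np


def grad_mu_samples(mu, sigma, k, n_rep, rng):
    """Draw n_rep reparameterized IW-ELBO gradients w.r.t. mu, each using k samples."""
    eps = rng.standard_normal((n_rep, k))
    z = mu + sigma * eps
    # log w = log p_tilde(z) - log q(z), up to an additive constant that
    # cancels when the weights are normalized below. Written in terms of
    # eps, log q does not depend on mu, so d(log w)/d(mu) = -z.
    log_w = -0.5 * z**2 + 0.5 * eps**2
    w = np.exp(log_w - log_w.max(axis=1, keepdims=True))
    w_norm = w / w.sum(axis=1, keepdims=True)
    return np.sum(w_norm * (-z), axis=1)


def snr(samples):
    """Empirical signal-to-noise ratio |mean| / std of a gradient estimator."""
    return float(np.abs(samples.mean()) / samples.std(ddof=1))


rng = np.random.default_rng(0)
snr_by_k = {k: snr(grad_mu_samples(1.0, 1.5, k, 10000, rng)) for k in (1, 16, 256)}
```

In this toy setting the SNR at K = 256 comes out well below the SNR at K = 1. The large-K estimates are themselves noisy, because the mean gradient is close to zero there, so the individual values should be read qualitatively.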
Abstract
The Importance-Weighted Evidence Lower Bound (IW-ELBO) has emerged as an effective objective for variational inference (VI), tightening the standard ELBO and mitigating the mode-seeking behaviour. However, optimizing the IW-ELBO in Euclidean space is often inefficient, as its gradient estimators suffer from a vanishing signal-to-noise ratio (SNR). This paper formulates the optimisation of the IW-ELBO in Bures-Wasserstein space, a manifold of Gaussian distributions equipped with the 2-Wasserstein metric. We derive the Wasserstein gradient of the IW-ELBO and project it onto the Bures-Wasserstein space to yield a tractable algorithm for Gaussian VI. A pivotal contribution of our analysis concerns the stability of the gradient estimator. While the SNR of the standard Euclidean gradient estimator is known to vanish as the number of importance samples increases, we prove that the SNR…
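To make the tightening claim concrete, here is a minimal Monte Carlo sketch of the IW-ELBO, log((1/K) Σ_k w_k) with w_k = p̃(z_k)/q(z_k) and z_k drawn from q. The toy target exp(-z^2/2), whose log normalizer is 0.5·log(2π), the choice q = N(1.0, 1.5^2), and all function names are assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

LOG_Z = 0.5 * np.log(2.0 * np.pi)  # log normalizer of the toy target exp(-z^2 / 2)


def log_p_tilde(z):
    """Unnormalized log density of the toy target."""
    return -0.5 * z**2


def log_q(z, mu, sigma):
    """Log density of the Gaussian variational distribution N(mu, sigma^2)."""
    return -0.5 * ((z - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)


def iw_elbo(mu, sigma, k, n_rep, rng):
    """Average of n_rep independent K-sample IW-ELBO estimates."""
    z = mu + sigma * rng.standard_normal((n_rep, k))
    log_w = log_p_tilde(z) - log_q(z, mu, sigma)
    # Numerically stable log-mean-exp over the K importance samples.
    m = log_w.max(axis=1, keepdims=True)
    log_mean_w = m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1))
    return float(log_mean_w.mean())


rng = np.random.default_rng(0)
bounds = {k: iw_elbo(1.0, 1.5, k, 4000, rng) for k in (1, 8, 64)}
```

With K = 1 the estimate is the standard ELBO. As K increases, the averaged estimates rise toward `LOG_Z` while staying below it in expectation.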
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning
