Alpha-divergence Variational Inference Meets Importance Weighted Auto-Encoders: Methodology and Asymptotics
Kamélia Daudel, Joe Benton, Yuyang Shi, Arnaud Doucet

TL;DR
This paper introduces the VR-IWAE bound, a generalized variational bound that combines alpha-divergence minimization with importance weighted auto-encoders, providing unbiased gradient estimators and theoretical insights.
Contribution
It formalizes the VR-IWAE bound, shows that it admits unbiased gradient estimators, and provides two complementary theoretical analyses of the bound, which also apply to the standard IWAE bound that it generalizes.
Findings
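The VR-IWAE bound enjoys several desirable properties and, in the reparameterized case, leads to the same stochastic gradient descent procedure as the VR bound.
Unlike the VR bound, the VR-IWAE bound admits unbiased gradient estimators.
Two complementary theoretical analyses shed light on the benefits, or lack thereof, of the IWAE and VR-IWAE bounds.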
Abstract
Several algorithms involving the Variational Rényi (VR) bound have been proposed to minimize an alpha-divergence between a target posterior distribution and a variational distribution. Despite promising empirical results, those algorithms resort to biased stochastic gradient descent procedures and thus lack theoretical guarantees. In this paper, we formalize and study the VR-IWAE bound, a generalization of the Importance Weighted Auto-Encoder (IWAE) bound. We show that the VR-IWAE bound enjoys several desirable properties and notably leads to the same stochastic gradient descent procedure as the VR bound in the reparameterized case, but this time by relying on unbiased gradient estimators. We then provide two complementary theoretical analyses of the VR-IWAE bound and thus of the standard IWAE bound. Those analyses shed light on the benefits or lack thereof of these bounds. Lastly, we…
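The abstract does not state the bound itself. As a rough illustration only, the sketch below assumes the VR-IWAE bound takes the form (1/(1-alpha)) * E[log((1/N) * sum_j w(z_j)^(1-alpha))], with importance weights w(z) = p(x, z) / q(z | x), samples z_1..z_N drawn from q, and alpha in [0, 1), so that alpha = 0 gives the IWAE bound. That form should be checked against the paper. The toy Gaussian model (z ~ N(0, 1), x | z ~ N(z, 1)), the Gaussian variational family, and every function name here are hypothetical choices made for this example, not the authors' code.

```python
import numpy as np


def log_normal(x, mean, std):
    """Log-density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi) - np.log(std) - 0.5 * ((x - mean) / std) ** 2


def vr_iwae_estimate(x, mu, sigma, alpha, n_samples, n_batches, rng):
    """Monte Carlo estimate of the assumed VR-IWAE bound for the toy model.

    Toy model (an assumption of this sketch): z ~ N(0, 1), x | z ~ N(z, 1),
    with variational distribution q(z | x) = N(mu, sigma^2).
    """
    # Reparameterized samples: one row of n_samples draws per outer batch.
    z = mu + sigma * rng.standard_normal((n_batches, n_samples))
    # Log importance weights: log p(x, z) - log q(z | x).
    log_w = log_normal(z, 0.0, 1.0) + log_normal(x, z, 1.0) - log_normal(z, mu, sigma)
    # Numerically stable log of the mean of w^(1 - alpha) within each row.
    scaled = (1.0 - alpha) * log_w
    m = scaled.max(axis=1, keepdims=True)
    log_mean = m[:, 0] + np.log(np.mean(np.exp(scaled - m), axis=1))
    # Average over rows to approximate the outer expectation.
    return float(np.mean(log_mean) / (1.0 - alpha))


rng = np.random.default_rng(0)
x_obs = 1.0
# In this toy model the marginal is x ~ N(0, 2), so log p(x) is known exactly.
log_px = float(log_normal(x_obs, 0.0, np.sqrt(2.0)))

# alpha = 0 with a single sample reduces to the usual ELBO estimate.
elbo = vr_iwae_estimate(x_obs, 0.0, 1.0, alpha=0.0, n_samples=1, n_batches=4000, rng=rng)
# A mismatched q with alpha = 0.5 and N = 10 samples.
vr = vr_iwae_estimate(x_obs, 0.0, 1.0, alpha=0.5, n_samples=10, n_batches=4000, rng=rng)
print(log_px, elbo, vr)
```

When q equals the exact posterior (here N(x/2, 1/2)), the weights are constant and equal to p(x), so under the assumed form the estimate coincides with log p(x) for any alpha in [0, 1). This gives a quick sanity check on the implementation.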
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis · Statistical Methods and Inference
