Tighter Variational Bounds are Not Necessarily Better
Tom Rainforth, Adam R. Kosiorek, Tuan Anh Le, Chris J. Maddison, Maximilian Igl, Frank Wood, Yee Whye Teh

TL;DR
This paper challenges the assumption that tighter variational bounds are always better training objectives. It shows that a tighter ELBO can lower the signal-to-noise ratio of the inference network's gradient estimator, and it introduces three algorithms (PIWAE, MIWAE, CIWAE) that can improve on IWAE.
Contribution
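The paper provides theoretical and empirical evidence that tighter ELBOs can be detrimental to learning an inference network, and it proposes three new algorithms, each of which includes IWAE as a special case and can deliver improvements over it.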
Findings
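Tighter ELBOs can reduce the signal-to-noise ratio of the inference network's gradient estimator.
The new algorithms (PIWAE, MIWAE, CIWAE) can each improve on IWAE, even when performance is measured by the IWAE target itself.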
PIWAE may improve training of both inference and generative networks.
Abstract
We provide theoretical and empirical evidence that using tighter evidence lower bounds (ELBOs) can be detrimental to the process of learning an inference network by reducing the signal-to-noise ratio of the gradient estimator. Our results call into question common implicit assumptions that tighter ELBOs are better variational objectives for simultaneous model learning and inference amortization schemes. Based on our insights, we introduce three new algorithms: the partially importance weighted auto-encoder (PIWAE), the multiply importance weighted auto-encoder (MIWAE), and the combination importance weighted auto-encoder (CIWAE), each of which includes the standard importance weighted auto-encoder (IWAE) as a special case. We show that each can deliver improvements over IWAE, even when performance is measured by the IWAE target itself. Furthermore, our results suggest that PIWAE may be…
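The bounds named in the abstract can be sketched numerically from a set of log importance weights. This is a minimal illustration, not the authors' implementation. It assumes the usual definitions: log_w[k] = log p(x, z_k) - log q(z_k | x), MIWAE as the average of M independent K-sample IWAE bounds, and CIWAE as a convex combination (weight beta) of the standard ELBO and the IWAE bound. The function names and the random stand-in weights are hypothetical. PIWAE is omitted because, as we understand it, it changes which target each network's gradient is taken from rather than the form of the bounds.

```python
import numpy as np

def logmeanexp(x, axis):
    # Numerically stable log(mean(exp(x))) along one axis.
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.mean(np.exp(x - m), axis=axis))

def elbo(log_w):
    # Standard (VAE) ELBO estimate: mean of the log weights.
    return np.mean(log_w)

def iwae_bound(log_w):
    # IWAE estimate from K log weights, shape (K,).
    return logmeanexp(log_w, axis=-1)

def miwae_bound(log_w):
    # Assumed MIWAE form: mean over M rows of K-sample IWAE estimates, shape (M, K).
    return np.mean(logmeanexp(log_w, axis=1))

def ciwae_bound(log_w, beta):
    # Assumed CIWAE form: beta * ELBO + (1 - beta) * IWAE on the same K weights.
    return beta * elbo(log_w) + (1.0 - beta) * iwae_bound(log_w)

# Stand-in log weights (random numbers, not from a trained model).
rng = np.random.default_rng(0)
log_w = rng.normal(size=(4, 8))  # M = 4 groups of K = 8 samples

print("ELBO :", elbo(log_w[0]))
print("IWAE :", iwae_bound(log_w[0]))
print("MIWAE:", miwae_bound(log_w))
print("CIWAE:", ciwae_bound(log_w[0], beta=0.5))
```

Under these assumed definitions, setting M = 1 in MIWAE or beta = 0 in CIWAE recovers the IWAE estimate, which matches the abstract's statement that IWAE is a special case of each.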
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks
