Inference Suboptimality in Variational Autoencoders
Chris Cremer, Xuechen Li, David Duvenaud

TL;DR
This paper investigates why inference in variational autoencoders is often suboptimal. It separates the effect of recognition network quality from that of the variational distribution's capacity, and examines how the generator adapts to the chosen approximation.
Contribution
It shows that divergence from the true posterior is often due to imperfect recognition networks rather than the limited complexity of the approximating distribution, and that this is partly because the generator learns to accommodate the choice of approximation.
Findings
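Divergence from the true posterior is often caused by an imperfect recognition network rather than by the limited complexity of the approximating distribution.
The generator learns to accommodate the choice of approximation, which partly explains this effect.
Parameters added to increase the expressiveness of the approximation also help inference generalize, rather than only increasing its complexity.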
Abstract
Amortized inference allows latent-variable models trained via variational learning to scale to large datasets. The quality of approximate inference is determined by two factors: a) the capacity of the variational distribution to match the true posterior and b) the ability of the recognition network to produce good variational parameters for each datapoint. We examine approximate inference in variational autoencoders in terms of these factors. We find that divergence from the true posterior is often due to imperfect recognition networks, rather than the limited complexity of the approximating distribution. We show that this is due partly to the generator learning to accommodate the choice of approximation. Furthermore, we show that the parameters used to increase the expressiveness of the approximation play a role in generalizing inference rather than simply improving the complexity of…
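The two factors named in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's experimental setup. It assumes a one-dimensional linear-Gaussian model, in which the true posterior is known in closed form and the gap between log p(x) and the ELBO equals KL(q || p(z|x)). The variational family is deliberately restricted to a fixed variance (factor a), and a hypothetical linear "recognition network" with a mis-set slope supplies the mean (factor b). The function names, the default parameter values, and the labels "approximation gap" and "amortization gap" are choices made for this illustration.

```python
import numpy as np

def kl_gauss(m_q, v_q, m_p, v_p):
    """KL( N(m_q, v_q) || N(m_p, v_p) ) for one-dimensional Gaussians."""
    return 0.5 * (np.log(v_p / v_q) + (v_q + (m_q - m_p) ** 2) / v_p - 1.0)

def inference_gaps(x, noise_var=0.5, q_var=1.0, amortized_slope=0.5):
    """Split the inference gap for a toy model (illustrative assumption):
        z ~ N(0, 1),  x | z ~ N(z, noise_var).
    The variational family is N(m, q_var) with q_var fixed, so only the
    mean m can be chosen. The hypothetical recognition network is the
    linear map m = amortized_slope * x.
    """
    # Exact posterior p(z | x) of the linear-Gaussian model.
    post_mean = x / (1.0 + noise_var)
    post_var = noise_var / (1.0 + noise_var)

    # Factor (a): the best member of the restricted family sets m to the
    # posterior mean; the remaining KL comes only from the fixed variance.
    approximation_gap = kl_gauss(post_mean, q_var, post_mean, post_var)

    # Total gap: KL from the q actually produced by the recognition network.
    total_gap = kl_gauss(amortized_slope * x, q_var, post_mean, post_var)

    # Factor (b): the extra gap caused by imperfect variational parameters.
    amortization_gap = total_gap - approximation_gap
    return approximation_gap, amortization_gap, total_gap

if __name__ == "__main__":
    approx, amort, total = inference_gaps(x=2.0)
    print(f"approximation gap: {approx:.4f}")
    print(f"amortization gap:  {amort:.4f}")
    print(f"total gap:         {total:.4f}")
```

In this toy, setting `amortized_slope` to `1 / (1 + noise_var)` removes the second gap, and setting `q_var` to the posterior variance removes the first, which mirrors the distinction the abstract draws between the two factors.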
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks
