Semi-Amortized Variational Autoencoders
Yoon Kim, Sam Wiseman, Andrew C. Miller, David Sontag, Alexander M. Rush

TL;DR
This paper introduces a semi-amortized variational autoencoder that combines amortized inference with local stochastic variational inference, improving generative modeling for text and images by avoiding posterior collapse.
Contribution
It proposes a hybrid inference method that refines amortized variational parameters with differentiable local SVI, enabling end-to-end training of richer models.
Findings
Outperforms autoregressive and variational baselines on text datasets
Reduces posterior collapse in training VAEs for text generation
Achieves better generative quality on image datasets
Abstract
Amortized variational inference (AVI) replaces instance-specific local inference with a global inference network. While AVI has enabled efficient training of deep generative models such as variational autoencoders (VAE), recent empirical work suggests that inference networks can produce suboptimal variational parameters. We propose a hybrid approach, to use AVI to initialize the variational parameters and run stochastic variational inference (SVI) to refine them. Crucially, the local SVI procedure is itself differentiable, so the inference network and generative model can be trained end-to-end with gradient-based optimization. This semi-amortized approach enables the use of rich generative models without experiencing the posterior-collapse phenomenon common in training VAEs for problems like text generation. Experiments show this approach outperforms strong autoregressive and…
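The two-stage procedure described in the abstract (an amortized initialization followed by local refinement) can be illustrated with a minimal sketch on a toy model. Everything here is an assumption made for illustration and is not taken from the paper: the scalar linear-Gaussian model p(z) = N(0, 1), p(x | z) = N(theta * z, 1), the one-parameter linear "encoder" in `encode`, the closed-form ELBO, and all function names and hyperparameters. The sketch uses hand-derived gradients and only shows the forward refinement; it does not implement backpropagation through the refinement steps, which the abstract identifies as the key ingredient for end-to-end training.

```python
import math

def elbo(x, mu, rho, theta):
    """Closed-form ELBO for the toy model with q(z) = N(mu, exp(rho)^2)."""
    s2 = math.exp(2.0 * rho)  # variance of q
    # E_q[log p(x | z)] for p(x | z) = N(theta * z, 1)
    recon = -0.5 * math.log(2.0 * math.pi) - 0.5 * ((x - theta * mu) ** 2 + theta ** 2 * s2)
    # KL(q(z) || p(z)) with p(z) = N(0, 1)
    kl = 0.5 * (mu ** 2 + s2 - 1.0 - 2.0 * rho)
    return recon - kl

def elbo_grad(x, mu, rho, theta):
    """Gradient of the ELBO with respect to the local parameters (mu, rho)."""
    s2 = math.exp(2.0 * rho)
    d_mu = theta * (x - theta * mu) - mu
    d_rho = 1.0 - (theta ** 2 + 1.0) * s2
    return d_mu, d_rho

def encode(x, w, b):
    """Hypothetical inference network: a linear map for mu, a constant for rho."""
    return w * x, b

def refine(x, mu, rho, theta, steps=20, lr=0.1):
    """Local refinement: plain gradient ascent on the ELBO for one data point."""
    for _ in range(steps):
        d_mu, d_rho = elbo_grad(x, mu, rho, theta)
        mu += lr * d_mu
        rho += lr * d_rho
    return mu, rho

if __name__ == "__main__":
    x, theta = 2.0, 1.5           # one observation and a fixed decoder weight
    mu0, rho0 = encode(x, w=0.1, b=0.0)   # deliberately poor amortized guess
    mu, rho = refine(x, mu0, rho0, theta)
    print("ELBO at amortized init:", elbo(x, mu0, rho0, theta))
    print("ELBO after refinement: ", elbo(x, mu, rho, theta))
```

In this toy setting the refined mean moves toward the exact posterior mean theta * x / (theta^2 + 1), so the gap between the two printed values is the part of the ELBO that the amortized guess alone leaves on the table. In a full implementation, an autodiff framework would be needed so that gradients with respect to the encoder and decoder parameters flow through the refinement loop.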
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
