Lagging Inference Networks and Posterior Collapse in Variational Autoencoders
Junxian He, Daniel Spokoyny, Graham Neubig, Taylor Berg-Kirkpatrick

TL;DR
This paper identifies that the lagging inference network during early training causes posterior collapse in VAEs and proposes a simple, effective method to mitigate this by optimizing the inference network based on mutual information, improving performance.
Contribution
It introduces a straightforward training modification that reduces inference lag, preventing posterior collapse without adding complexity or new components to VAEs.
Findings
Outperforms autoregressive baselines on text and image benchmarks
Effectively avoids posterior collapse in VAEs
Achieves competitive results with faster training times
Abstract
The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique. By using a neural inference network to approximate the model's posterior on latent variables, VAEs efficiently parameterize a lower bound on marginal data likelihood that can be optimized directly via gradient methods. In practice, however, VAE training often results in a degenerate local optimum known as "posterior collapse" where the model learns to ignore the latent variable and the approximate posterior mimics the prior. In this paper, we investigate posterior collapse from the perspective of training dynamics. We find that during the initial stages of training the inference network fails to approximate the model's true posterior, which is a moving target. As a result, the model is encouraged to ignore the latent encoding and posterior collapse…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenerative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
MethodsSolana Customer Service Number +1-833-534-1729 · USD Coin Customer Service Number +1-833-534-1729
