Lagging Inference Networks and Posterior Collapse in Variational Autoencoders

Junxian He; Daniel Spokoyny; Graham Neubig; Taylor Berg-Kirkpatrick

arXiv:1901.05534·cs.LG·January 30, 2019·127 cites

TL;DR

This paper argues that posterior collapse in VAEs arises because the inference network lags behind the model's true posterior during the early stages of training. It proposes a simple mitigation: aggressively optimizing the inference network before each model update, with the current mutual information between the latent variable and the observation determining when to do so.

Contribution

It introduces a straightforward training modification that reduces inference lag, preventing posterior collapse without adding complexity or new components to VAEs.

Findings

01

Outperforms strong autoregressive baselines on text and image benchmarks in terms of held-out likelihood

02

Effectively avoids posterior collapse in VAEs

03

Is competitive with more complex techniques for avoiding collapse while training substantially faster

Abstract

The variational autoencoder (VAE) is a popular combination of deep latent variable model and accompanying variational learning technique. By using a neural inference network to approximate the model's posterior on latent variables, VAEs efficiently parameterize a lower bound on marginal data likelihood that can be optimized directly via gradient methods. In practice, however, VAE training often results in a degenerate local optimum known as "posterior collapse" where the model learns to ignore the latent variable and the approximate posterior mimics the prior. In this paper, we investigate posterior collapse from the perspective of training dynamics. We find that during the initial stages of training the inference network fails to approximate the model's true posterior, which is a moving target. As a result, the model is encouraged to ignore the latent encoding and posterior collapse…
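
The lower bound mentioned in the abstract is the standard evidence lower bound (ELBO). As a brief sketch, the block below writes it out and states what posterior collapse means in terms of its KL term. The symbols are assumptions that do not appear on this page: $\theta$ for the generative model parameters, $\phi$ for the inference network parameters, $x$ for an observation, and $z$ for the latent variable.

```latex
% Evidence lower bound (ELBO) on the marginal log-likelihood.
% Notation (theta, phi, x, z) is assumed here, not taken from this page.
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\left[ \log p_\theta(x \mid z) \right]
  \;-\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right)

% The gap between the two sides is the inference error:
\log p_\theta(x) - \mathrm{ELBO}(x; \theta, \phi)
  \;=\; \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p_\theta(z \mid x) \right)

% Posterior collapse: the approximate posterior matches the prior for every x,
% so the KL term in the ELBO vanishes and z carries no information about x.
q_\phi(z \mid x) \;=\; p(z) \quad \text{for all } x
  \;\;\Longrightarrow\;\;
  \mathrm{KL}\!\left( q_\phi(z \mid x) \,\|\, p(z) \right) = 0
```

Read this way, a lagging inference network corresponds to a large gap in the second line early in training, while the collapsed state in the third line is one in which the decoder has learned to model the data without using $z$.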

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference
