Diagnosing and Enhancing VAE Models

Bin Dai; David Wipf

arXiv:1903.05789 · cs.LG · October 31, 2019 · 95 cites

TL;DR

This paper provides a detailed analysis of variational autoencoders, challenges common assumptions about their limitations, and introduces a simple enhancement that improves sample quality and stability, rivaling GANs without extra tuning.

Contribution

It offers a rigorous analysis of VAE objectives and proposes a hyperparameter-free enhancement that improves sample quality and stability.

Findings

1. Enhanced VAE produces crisp samples
2. Achieves stable FID scores comparable to GANs
3. No additional hyperparameters required

Abstract

Although variational autoencoders (VAEs) represent a widely influential deep generative model, many aspects of the underlying energy function remain poorly understood. In particular, it is commonly believed that Gaussian encoder/decoder assumptions reduce the effectiveness of VAEs in generating realistic samples. In this regard, we rigorously analyze the VAE objective, differentiating situations where this belief is and is not actually true. We then leverage the corresponding insights to develop a simple VAE enhancement that requires no additional hyperparameters or sensitive tuning. Quantitatively, this proposal produces crisp samples and stable FID scores that are actually competitive with a variety of GAN models, all while retaining desirable attributes of the original VAE architecture. A shorter version of this work will appear in the ICLR 2019 conference proceedings (Dai and Wipf,…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Music and Audio Processing

Methods: Convolution