Fixing a Broken ELBO

Alexander A. Alemi; Ben Poole; Ian Fischer; Joshua V. Dillon; Rif A. Saurous; Kevin Murphy

arXiv:1711.00464 · cs.LG · February 15, 2018 · 44 cites

TL;DR

This paper analyzes the limitations of ELBO-based training in deep latent-variable models. It proposes variational bounds on mutual information to characterize the tradeoff between compression and reconstruction, and introduces a method that keeps the model from ignoring its latent code.

Contribution

The paper introduces a theoretical framework built on variational lower and upper bounds on the mutual information between the input and the latent variable, along with a method that ensures models with stochastic decoders learn meaningful latent representations.

Findings

01 Identifies a family of models with identical ELBO but different quantitative and qualitative characteristics.

02 Derives a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy.

03 Proposes a simple method to keep the model from ignoring its latent variable.
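
One way to see how models can share an ELBO yet behave differently is to split the ELBO into a distortion term and a rate term. The notation below is an assumption made for this sketch and does not appear on this page: p*(x) is the data distribution, e(z|x) the encoder, d(x|z) the decoder, m(z) the latent marginal, and β a weighting coefficient.

```latex
\begin{align}
D &= -\,\mathbb{E}_{p^*(x)}\,\mathbb{E}_{e(z\mid x)}\big[\log d(x\mid z)\big]
  && \text{(distortion: reconstruction cost)} \\
R &= \mathbb{E}_{p^*(x)}\,\mathrm{KL}\big(e(z\mid x)\,\|\,m(z)\big)
  && \text{(rate: information stored in } z\text{)} \\
\mathrm{ELBO} &= -(D + R) \\
\min_{e,\,m,\,d}\;& D + \beta R
  && \text{(weighted objective; } \beta = 1 \text{ recovers the ELBO)}
\end{align}
```

Under this decomposition, every (R, D) pair on a line D + R = c has the same ELBO, whether the model stores almost nothing in z (low R, higher D) or a great deal (high R, low D). Choosing β < 1 penalizes rate less and is one plausible way to push a model with a powerful decoder toward using its latent code; this page does not confirm that this is exactly the method the paper proposes.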

Abstract

Recent work in unsupervised representation learning has focused on learning deep directed latent-variable models. Fitting these models by maximizing the marginal likelihood or evidence is typically intractable, thus a common approximation is to maximize the evidence lower bound (ELBO) instead. However, maximum likelihood training (whether exact or approximate) does not necessarily result in a good latent representation, as we demonstrate both theoretically and empirically. In particular, we derive variational lower and upper bounds on the mutual information between the input and the latent variable, and use these bounds to derive a rate-distortion curve that characterizes the tradeoff between compression and reconstruction accuracy. Using this framework, we demonstrate that there is a family of models with identical ELBO, but different quantitative and qualitative characteristics. Our…
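
The bounds mentioned in the abstract can be checked numerically on a tiny discrete model. The sketch below assumes the sandwich form H - D <= I(X;Z) <= R, with H the data entropy, D the distortion, and R the rate; the symbol names and all probability tables are hypothetical values chosen for illustration, not taken from the paper.

```python
import math

# Hypothetical toy model (values invented for illustration, not from the paper):
# two inputs x in {0, 1} and two latent codes z in {0, 1}.
p_x = [0.5, 0.5]                  # data distribution p*(x)
enc = [[0.9, 0.1], [0.2, 0.8]]    # encoder e(z|x), indexed enc[x][z]
dec = [[0.8, 0.2], [0.3, 0.7]]    # decoder d(x|z), indexed dec[z][x]
m_z = [0.5, 0.5]                  # variational marginal m(z)

xs, zs = range(len(p_x)), range(len(m_z))

# Data entropy H = -sum_x p*(x) log p*(x)
H = -sum(p_x[x] * math.log(p_x[x]) for x in xs)

# Distortion D = -E_{p*(x)} E_{e(z|x)} log d(x|z)
D = -sum(p_x[x] * enc[x][z] * math.log(dec[z][x]) for x in xs for z in zs)

# Rate R = E_{p*(x)} KL(e(z|x) || m(z))
R = sum(p_x[x] * enc[x][z] * math.log(enc[x][z] / m_z[z]) for x in xs for z in zs)

# Exact mutual information under the joint p*(x) e(z|x),
# using the aggregate posterior q(z) = sum_x p*(x) e(z|x).
q_z = [sum(p_x[x] * enc[x][z] for x in xs) for z in zs]
I = sum(p_x[x] * enc[x][z] * math.log(enc[x][z] / q_z[z]) for x in xs for z in zs)

print(f"H - D = {H - D:.4f} <= I = {I:.4f} <= R = {R:.4f}")
print(f"ELBO (data average) = {-(D + R):.4f}")
```

The gap between I and R is the KL divergence from the aggregate posterior q(z) to m(z), so the upper bound tightens as m(z) matches q(z). The lower bound tightens as the decoder approaches the true posterior over x given z.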

Code & Models

Repositories

suvalaki/Deeper (tf)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification