Understanding disentangling in $\beta$-VAE
Christopher P. Burgess, Irina Higgins, Arka Pal, Loic Matthey, Nick Watters, Guillaume Desjardins, Alexander Lerchner

TL;DR
This paper offers new theoretical insights into how disentangled representations emerge in $\beta$-VAE, and proposes a training modification that improves disentanglement without sacrificing reconstruction quality.
Contribution
It introduces a rate-distortion perspective for understanding disentanglement and proposes progressively increasing the latent code's information capacity during $\beta$-VAE training.
Findings
Disentanglement emerges under specific rate-distortion conditions.
Progressive capacity increase improves disentanglement and reconstruction.
Theoretical assessment clarifies conditions for disentangled representation emergence.
Abstract
We present new intuitions and theoretical assessments of the emergence of disentangled representation in variational autoencoders. Taking a rate-distortion theory perspective, we show the circumstances under which representations aligned with the underlying generative factors of variation of data emerge when optimising the modified ELBO bound in $\beta$-VAE, as training progresses. From these insights, we propose a modification to the training regime of $\beta$-VAE that progressively increases the information capacity of the latent code during training. This modification facilitates the robust learning of disentangled representations in $\beta$-VAE, without the previous trade-off in reconstruction accuracy.
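One way to read "progressively increases the information capacity of the latent code" is as a loss that pulls the KL term toward a target capacity $C$ that grows during training, for example by minimising a reconstruction term plus $\gamma\,|D_{KL}(q(z|x)\,\|\,p(z)) - C|$. The sketch below illustrates that idea with NumPy under stated assumptions: a diagonal Gaussian posterior, a standard normal prior, and a linear schedule for $C$. The function names and the default values for `gamma`, `c_max`, and `anneal_steps` are illustrative placeholders, not settings taken from this page.

```python
import numpy as np


def gaussian_kl(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)) in nats, summed over latent dims."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=-1)


def capacity_schedule(step, c_max=25.0, anneal_steps=100_000):
    """Target capacity C: rises linearly from 0 to c_max, then stays at c_max.

    c_max and anneal_steps are assumed, illustrative defaults.
    """
    return c_max * min(step / anneal_steps, 1.0)


def capacity_loss(recon_nll, mu, logvar, step,
                  gamma=1000.0, c_max=25.0, anneal_steps=100_000):
    """Reconstruction NLL plus gamma * |KL - C(step)|, with KL averaged over the batch.

    recon_nll is the batch-mean negative log-likelihood of the decoder,
    computed elsewhere; gamma is an assumed penalty weight.
    """
    kl = gaussian_kl(mu, logvar).mean()
    c = capacity_schedule(step, c_max, anneal_steps)
    return recon_nll + gamma * abs(kl - c)
```

Early in training $C$ is near zero, so the penalty keeps the latent code close to the prior and the model can only afford to encode the factors that most reduce reconstruction error; as $C$ grows, additional factors can be encoded while the KL is held near the current target rather than being pushed toward zero.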
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Digital Media Forensic Detection
