Learning Disentangled Discrete Representations
David Friede, Christian Reimers, Heiner Stuckenschmidt, Mathias Niepert

TL;DR
This paper investigates how discrete latent spaces in variational autoencoders can promote disentangled representations. It shows that the grid structure of categorical latents acts as an efficient inductive prior for disentanglement, and it introduces an unsupervised model selection method.
Contribution
It demonstrates the benefits of discrete VAEs for disentanglement and proposes the first unsupervised strategy for selecting models with disentangled representations.
Findings
Discrete VAEs mitigate the rotational invariance associated with multivariate Gaussian latent distributions.
Categorical distributions act as efficient priors for disentanglement.
An unsupervised model selection method favors disentangled representations.
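The rotational invariance mentioned in the findings can be illustrated numerically. The sketch below is not taken from the paper. It only checks the general fact that an isotropic standard Gaussian assigns the same density to a latent code and to any rotated copy of it, so a Gaussian prior alone gives no preference to axis-aligned (disentangled) codes. The helper names `standard_normal_logpdf` and `rotation_2d` are hypothetical and chosen for illustration.

```python
import numpy as np


def standard_normal_logpdf(z):
    """Log density of an isotropic standard Gaussian N(0, I) for each row of z."""
    d = z.shape[-1]
    return -0.5 * (d * np.log(2.0 * np.pi) + np.sum(z ** 2, axis=-1))


def rotation_2d(theta):
    """2x2 rotation matrix for an angle theta given in radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])


rng = np.random.default_rng(0)
z = rng.normal(size=(5, 2))        # five illustrative 2-D latent codes
z_rot = z @ rotation_2d(0.7).T     # the same codes after rotating the latent axes

logp = standard_normal_logpdf(z)
logp_rot = standard_normal_logpdf(z_rot)

# Rotation preserves the squared norm, so the prior density is unchanged.
density_is_rotation_invariant = bool(np.allclose(logp, logp_rot))
```

A categorical latent has no analogous continuous family of rotations, which is the intuition behind the grid structure acting as an inductive prior.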
Abstract
Recent successes in image generation, model-based reinforcement learning, and text-to-image generation have demonstrated the empirical advantages of discrete latent representations, although the reasons behind their benefits remain unclear. We explore the relationship between discrete latent spaces and disentangled representations by replacing the standard Gaussian variational autoencoder (VAE) with a tailored categorical variational autoencoder. We show that the underlying grid structure of categorical distributions mitigates the problem of rotational invariance associated with multivariate Gaussian distributions, acting as an efficient inductive prior for disentangled representations. We provide both analytical and empirical findings that demonstrate the advantages of discrete VAEs for learning disentangled representations. Furthermore, we introduce the first unsupervised model…
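As a rough illustration of what a categorical latent looks like in practice, the sketch below draws relaxed samples with the Gumbel-softmax trick, a common way to train categorical VAEs. Whether the paper's tailored model uses exactly this relaxation is an assumption here. The function name `gumbel_softmax_sample`, the latent shape, and the temperature value are hypothetical choices for illustration.

```python
import numpy as np


def gumbel_softmax_sample(logits, temperature, rng):
    """Draw a relaxed one-hot sample for each row of logits.

    logits: array of shape (n_latents, n_categories), unnormalized log-probabilities.
    temperature: positive float; smaller values give samples closer to one-hot.
    """
    # Gumbel(0, 1) noise via the inverse CDF; the lower bound avoids log(0).
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / temperature
    # Numerically stable softmax over the category axis.
    y = y - y.max(axis=-1, keepdims=True)
    exp_y = np.exp(y)
    return exp_y / exp_y.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 8))   # assumed: 3 latent variables, 8 categories each
sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)
row_sums = sample.sum(axis=-1)     # each row lies on the probability simplex
```

Each row of `sample` is a point on the probability simplex over the categories of one latent variable. Lowering `temperature` pushes rows toward one-hot vectors, at the cost of noisier gradients in a real training loop.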
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · AI in cancer detection · Domain Adaptation and Few-Shot Learning
