Disentangling Variational Autoencoders
Rafael Pastrana

TL;DR
This paper investigates how to achieve disentangled latent spaces in variational autoencoders, enabling more controllable and interpretable data generation, by analyzing different VAE models trained on handwritten digit images.
Contribution
It provides an experimental analysis of latent space disentanglement in VAEs, identifying factors that improve interpretability and control over generated data.
Findings
Latent dimensions can be aligned with visual properties like line weight, tilt, and width.
Increasing the weight of the KL divergence term in the training loss improves disentanglement.
Conditioning the model on the class of the input improves latent space interpretability.
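The last two findings can be illustrated with a minimal NumPy sketch. It assumes a diagonal-Gaussian encoder, a standard normal prior, and a Bernoulli pixel likelihood, none of which this summary confirms for the paper. The names `gaussian_kl`, `weighted_vae_loss`, `append_class_label`, and the weight `beta` are hypothetical and chosen for illustration only.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL(N(mu, diag(exp(log_var))) || N(0, I)) per sample, summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)

def bernoulli_recon_loss(x, x_hat, eps=1e-7):
    """Negative Bernoulli log-likelihood per sample (assumed pixel model)."""
    x_hat = np.clip(x_hat, eps, 1.0 - eps)
    return -np.sum(x * np.log(x_hat) + (1.0 - x) * np.log(1.0 - x_hat), axis=-1)

def weighted_vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction loss plus a KL term scaled by beta (hypothetical weight).

    beta = 1 gives the usual negative ELBO; beta > 1 increases the
    contribution of the KL divergence to the objective.
    """
    return np.mean(bernoulli_recon_loss(x, x_hat) + beta * gaussian_kl(mu, log_var))

def append_class_label(z, labels, num_classes):
    """Concatenate a one-hot class label to each row of z.

    One common way to condition a VAE on class: the same trick can be
    applied to the encoder input and to the latent code fed to the decoder.
    """
    one_hot = np.eye(num_classes)[labels]
    return np.concatenate([z, one_hot], axis=-1)
```

With the other terms held fixed, raising `beta` penalizes posteriors that stray from the prior more heavily, which is the mechanism usually credited for pushing latent dimensions toward independent factors. Appending the class label relieves the latent code of having to encode digit identity, leaving its dimensions free to capture style attributes.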
Abstract
A variational autoencoder (VAE) is a probabilistic machine learning framework for posterior inference that projects an input set of high-dimensional data to a lower-dimensional latent space. The latent space learned with a VAE offers exciting opportunities to develop new data-driven design processes in creative disciplines, in particular, to automate the generation of multiple novel designs that are aesthetically reminiscent of the input data but that were unseen during training. However, the learned latent space is typically disorganized and entangled: traversing the latent space along a single dimension does not result in changes to single visual attributes of the data. This lack of latent structure prevents designers from deliberately controlling the visual attributes of new designs generated from the latent space. This paper presents an experimental study that investigates latent…
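A latent traversal, as described in the abstract, varies one latent coordinate while holding the others fixed and then decodes each resulting code. The sketch below builds only the batch of latent codes. The function name `traverse_latent` and the sweep range are assumptions, and the decoder is left out because this summary does not describe the paper's architecture.

```python
import numpy as np

def traverse_latent(z_base, dim, values):
    """Return one latent code per entry of `values`.

    Each row is a copy of z_base in which only coordinate `dim` is
    replaced by the corresponding value; all other coordinates stay fixed.
    """
    z_base = np.asarray(z_base, dtype=float)
    codes = np.tile(z_base, (len(values), 1))
    codes[:, dim] = values
    return codes

# Example: sweep dimension 2 of a 4-dimensional code from -3 to 3 in 7 steps.
sweep = np.linspace(-3.0, 3.0, 7)
codes = traverse_latent(np.zeros(4), dim=2, values=sweep)
# Each row of `codes` would then be passed to a trained decoder.
```

In a disentangled latent space, the decoded images along such a sweep would differ in a single visual attribute, such as tilt. In an entangled space, several attributes change at once.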
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Aesthetic Perception and Analysis · Cell Image Analysis Techniques
Methods: ALIGN
