TL;DR
This paper introduces Split Variational Autoencoders (SVAE), which decompose generated images into meaningful components using learned maps, improving generation quality and interpretability without extra loss functions.
Contribution
The paper proposes a novel SVAE model that automatically decomposes images into components via learned maps, enhancing generative performance without additional training constraints.
Findings
SVAE outperforms previous variational models on MNIST, CIFAR-10, and CelebA.
Decomposition schemes can be syntactic or semantic, affecting image quality.
The method improves FID scores by encouraging meaningful image splits.
Abstract
In this article we introduce the notion of Split Variational Autoencoder (SVAE), whose output is obtained as a weighted sum of two generated images, with the weights given by a learned compositional map. The two component images, as well as the map, are automatically synthesized by the model. The network is trained as a usual Variational Autoencoder, with a negative log-likelihood loss between training and reconstructed images. No additional loss is required for the component images or the map, nor any form of human tuning. The decomposition is nondeterministic, but follows two main schemes, which we may roughly categorize as either "syntactic" or "semantic". In the first case, the map tends to exploit the strong correlation between adjacent pixels, splitting the image…
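The composition step described in the abstract can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the weighted sum is a per-pixel convex combination, that the map is squashed into [0, 1] with a sigmoid, and that a squared-error term stands in for the negative log-likelihood. The names `compose`, `reconstruction_loss`, and `map_logits` are hypothetical, and random arrays stand in for the decoder outputs.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def compose(x1, x2, map_logits):
    """Blend two images with a per-pixel map (hypothetical helper).

    Assumption: the map is obtained by a sigmoid, so each weight lies
    in [0, 1] and the output is a convex combination of x1 and x2.
    """
    weights = sigmoid(map_logits)
    return weights * x1 + (1.0 - weights) * x2, weights


def reconstruction_loss(x, x_hat):
    """Squared error on the composed output only.

    Assumption: this stands in for the negative log-likelihood; no term
    is applied to the two component images or to the map.
    """
    return float(np.sum((x - x_hat) ** 2))


# Random placeholders for what a decoder would produce.
rng = np.random.default_rng(0)
x1 = rng.random((28, 28))                  # first component image
x2 = rng.random((28, 28))                  # second component image
map_logits = rng.normal(size=(28, 28))     # unnormalised compositional map
target = rng.random((28, 28))              # training image

x_hat, weights = compose(x1, x2, map_logits)
loss = reconstruction_loss(target, x_hat)
```

Because the loss only sees `x_hat`, nothing in this sketch constrains how the two components or the map divide the work, which is consistent with the abstract's remark that the decomposition is nondeterministic.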
