PixelVAE: A Latent Variable Model for Natural Images
Ishaan Gulrajani, Kundan Kumar, Faruk Ahmed, Adrien Ali Taiga, Francesco Visin, David Vazquez, Aaron Courville

TL;DR
PixelVAE combines a VAE with a PixelCNN-based decoder to model natural images, capturing both global structure and fine details while using far fewer autoregressive layers than PixelCNN. It also extends to a hierarchy of latent variables at different scales.
Contribution
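The paper introduces PixelVAE, a VAE with an autoregressive decoder based on PixelCNN, and extends it to a hierarchy of latent variables at different scales. The model reaches state-of-the-art performance on binarized MNIST and competitive performance on 64x64 ImageNet.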
Findings
State-of-the-art on binarized MNIST
Competitive on 64x64 ImageNet
High-quality samples on LSUN bedrooms
Abstract
Natural image modeling is a landmark challenge of unsupervised learning. Variational Autoencoders (VAEs) learn a useful latent representation and model global structure well but have difficulty capturing small details. PixelCNN models details very well, but lacks a latent code and is difficult to scale for capturing large structures. We present PixelVAE, a VAE model with an autoregressive decoder based on PixelCNN. Our model requires very few expensive autoregressive layers compared to PixelCNN and learns latent codes that are more compressed than a standard VAE while still capturing most non-trivial structure. Finally, we extend our model to a hierarchy of latent variables at different scales. Our model achieves state-of-the-art performance on binarized MNIST, competitive performance on 64x64 ImageNet, and high-quality samples on the LSUN bedrooms dataset.
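The "autoregressive decoder based on PixelCNN" mentioned in the abstract is built from masked convolutions, in which each output pixel may only depend on pixels above it and to its left. The following is a minimal single-channel sketch of how such a mask can be constructed. It is not taken from the authors' code, and the function name `causal_mask` is hypothetical. Type "A" hides the centre pixel (as in a first layer), while type "B" keeps it (as in later layers).

```python
import numpy as np


def causal_mask(kernel_size, mask_type):
    """Build a PixelCNN-style 0/1 mask for a square convolution kernel.

    Hypothetical helper for illustration only (single-channel case).
    mask_type "A" excludes the centre pixel; "B" includes it.
    """
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")

    mask = np.ones((kernel_size, kernel_size), dtype=np.float32)
    centre = kernel_size // 2

    # In the centre row, zero everything to the right of the centre,
    # and the centre itself for type "A".
    start = centre if mask_type == "A" else centre + 1
    mask[centre, start:] = 0.0

    # Zero every row below the centre row.
    mask[centre + 1:, :] = 0.0
    return mask


if __name__ == "__main__":
    print(causal_mask(3, "A"))
    print(causal_mask(3, "B"))
```

Multiplying a kernel elementwise by this mask before each convolution enforces the raster-scan ordering. In a PixelVAE-style decoder, these masked layers would additionally receive features computed from the latent code, which is why only a few of them are needed.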
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition · Cancer-related molecular mechanisms research
Methods: PixelCNN
