Discouraging posterior collapse in hierarchical Variational Autoencoders using context
Anna Kuzina, Jakub M. Tomczak

TL;DR
This paper shows that posterior collapse persists in top-down hierarchical VAEs and proposes adding a top-level context latent, computed with a Discrete Cosine Transform, to improve latent space utilization without harming generative quality.
Contribution
The paper introduces a deep hierarchical VAE whose last (top-level) latent variable is a context obtained with a Discrete Cosine Transform. The context is meant to mitigate posterior collapse, a problem commonly assumed to be absent from top-down hierarchical models.
Findings
Improved latent space utilization observed
No negative impact on generative performance
Effective reduction of posterior collapse
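One common way to make "latent space utilization" concrete is to count the latent dimensions whose approximate posterior stays measurably far from the prior. The sketch below is a generic diagnostic, not necessarily the metric used in the paper. It assumes diagonal Gaussian posteriors and, as a simplification, a standard normal prior (a hierarchical VAE would normally use a learned conditional prior). The function names and the 0.01 threshold are hypothetical.

```python
import math


def gaussian_kl_per_dim(mu, log_var):
    """KL( N(mu, exp(log_var)) || N(0, 1) ) for each latent dimension, in nats."""
    return [0.5 * (math.exp(lv) + m * m - 1.0 - lv) for m, lv in zip(mu, log_var)]


def active_fraction(batch_mu, batch_log_var, threshold=0.01):
    """Fraction of latent dimensions whose batch-averaged KL exceeds `threshold`.

    `threshold` is an illustrative cut-off (assumption), not a value from the paper.
    """
    n_dims = len(batch_mu[0])
    totals = [0.0] * n_dims
    for mu, log_var in zip(batch_mu, batch_log_var):
        for d, kl in enumerate(gaussian_kl_per_dim(mu, log_var)):
            totals[d] += kl
    mean_kl = [t / len(batch_mu) for t in totals]
    return sum(kl > threshold for kl in mean_kl) / n_dims


# Toy batch: dimension 0 matches the prior exactly (collapsed), dimension 1 does not.
batch_mu = [[0.0, 1.0], [0.0, -1.0]]
batch_log_var = [[0.0, 0.0], [0.0, 0.0]]
fraction = active_fraction(batch_mu, batch_log_var)
```

A dimension whose averaged KL is close to zero carries almost no information about the input, which is the symptom that posterior collapse refers to.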
Abstract
Hierarchical Variational Autoencoders (VAEs) are among the most popular likelihood-based generative models. There is a consensus that the top-down hierarchical VAEs allow effective learning of deep latent structures and avoid problems like posterior collapse. Here, we show that this is not necessarily the case, and the problem of collapsing posteriors remains. To discourage this issue, we propose a deep hierarchical VAE with a context on top. Specifically, we use a Discrete Cosine Transform to obtain the last latent variable. In a series of experiments, we observe that the proposed modification allows us to achieve better utilization of the latent space and does not harm the model's generative abilities.
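To illustrate how a Discrete Cosine Transform can yield a compact top-level context, the sketch below computes an orthonormal 2D DCT-II of a square grayscale image and keeps only the low-frequency block. This is a minimal illustration under stated assumptions, not the authors' implementation. The abstract does not specify how many coefficients are kept or any further processing, so `dct_context`, the `keep` parameter, and the toy image are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def dct_context(image, keep):
    """2D DCT of a square image, cropped to the keep x keep low-frequency block.

    `keep` is an illustrative parameter (assumption); the source does not state
    how many coefficients the paper retains.
    """
    n = len(image)
    c = dct_matrix(n)
    coeffs = matmul(matmul(c, image), transpose(c))  # C @ X @ C^T
    return [row[:keep] for row in coeffs[:keep]]


# Toy 8x8 "image" with a smooth gradient; most energy lands in low frequencies.
image = [[float(r + c) for c in range(8)] for r in range(8)]
context = dct_context(image, keep=3)
```

Because the transform is fixed rather than learned, a context built this way is a deterministic function of the input, so it cannot collapse onto the prior in the way a learned posterior can.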
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing
Methods: Hierarchical Variational Autoencoder · Discrete Cosine Transform
