Training VAEs Under Structured Residuals
Gara Dorta, Sara Vicente, Lourdes Agapito, Neill D.F. Campbell, Ivor Simpson

TL;DR
This paper adds a structured Gaussian likelihood prediction network to the VAE, so that correlations between pixel residuals are modeled rather than assumed independent.
Contribution
The paper presents a new VAE framework with covariance matrix prediction for structured residuals and a mechanism for structured uncertainty in color images.
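For reference, the difference between a factorized and a structured Gaussian likelihood can be written as below. The notation is an assumption made here for illustration and is not taken from the paper: x is an image with n pixels, z is the latent code, and μ, σ², and Σ stand for decoder outputs.

```latex
\begin{align}
\text{factorized:}\quad p(x \mid z) &= \prod_{i=1}^{n} \mathcal{N}\!\left(x_i \mid \mu_i(z),\, \sigma_i^2(z)\right) \\
\text{structured:}\quad p(x \mid z) &= \mathcal{N}\!\left(x \mid \mu(z),\, \Sigma(z)\right)
\end{align}
```

The factorized form is the special case in which Σ(z) is diagonal, so it cannot express correlations between pixels.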
Findings
Incorporating covariance prediction enhances residual modeling.
The architecture allows modeling of long-range pixel correlations.
The proposed training scheme improves efficiency and the capture of residual structure.
Abstract
Variational auto-encoders (VAEs) are a popular and powerful deep generative model. Previous works on VAEs have assumed a factorized likelihood model, whereby the output uncertainty of each pixel is assumed to be independent. This approximation is clearly limited, as demonstrated by observing a residual image from a VAE reconstruction, which often possesses a high level of structure. This paper demonstrates a novel scheme to incorporate a structured Gaussian likelihood prediction network within the VAE that allows the residual correlations to be modeled. Our novel architecture, with minimal increase in complexity, incorporates the covariance matrix prediction within the VAE. We also propose a new mechanism for allowing structured uncertainty on color images. Furthermore, we provide a scheme for effectively training this model, and include some suggestions for improving performance in terms…
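To make the contrast between a factorized and a structured likelihood concrete, here is a minimal NumPy sketch of the two Gaussian negative log-likelihoods. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, and the structured case is parameterized by a lower-triangular Cholesky factor `L` of the precision matrix, a choice made here for convenience since the abstract does not state the parameterization.

```python
import numpy as np


def diag_gaussian_nll(x, mu, var):
    """Negative log-likelihood under a factorized Gaussian.

    Each pixel is treated as independent with its own variance `var[i]`.
    """
    r = x - mu
    return 0.5 * np.sum(r**2 / var + np.log(var) + np.log(2.0 * np.pi))


def structured_gaussian_nll(x, mu, L):
    """Negative log-likelihood under N(mu, Sigma) with Sigma^{-1} = L @ L.T.

    `L` is assumed (for this sketch) to be lower triangular with a
    positive diagonal, i.e. a Cholesky factor of the precision matrix.
    """
    r = x - mu
    z = L.T @ r                      # r^T (L L^T) r == ||L^T r||^2
    log_det_precision = 2.0 * np.sum(np.log(np.diag(L)))
    n = x.size
    return 0.5 * (z @ z - log_det_precision + n * np.log(2.0 * np.pi))


# Toy residual whose two "pixels" err in the same direction.
mu = np.zeros(2)
x = np.array([1.0, 1.0])

# Factorized model: unit variance per pixel, no correlation.
nll_diag = diag_gaussian_nll(x, mu, np.ones(2))

# Structured model: same marginal variances, correlation 0.9.
sigma = np.array([[1.0, 0.9], [0.9, 1.0]])
L = np.linalg.cholesky(np.linalg.inv(sigma))   # precision = L @ L.T
nll_structured = structured_gaussian_nll(x, mu, L)

print(nll_diag, nll_structured)
```

In this toy case the structured model assigns a lower negative log-likelihood to the correlated residual than the factorized model does, which is the kind of structure the abstract says a per-pixel likelihood cannot capture.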
