Re-parameterizing VAEs for stability
David Dehaene, Rémy Brossard

TL;DR
This paper introduces a theoretically grounded re-parameterization for VAEs that improves training stability by fixing numerical issues at the interface between the model's neural networks and their output distributions, making deep VAEs more reliable to train.
Contribution
The paper provides a theoretically justified re-parameterization of the Normal distributions used in VAEs, aimed at improving training stability and avoiding NaN losses.
Findings
Re-parameterization reduces training instability in deep VAEs
Small parameterization changes prevent NaN losses
Method enhances training reliability on complex datasets
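As a rough illustration of the second finding, the sketch below compares two ways of turning a raw encoder output into the variance of the encoded Normal distribution. It is not the authors' parameterization: the softplus variant is simply a commonly used alternative chosen here as an assumption, and the function names (kl_logvar, kl_softplus), the eps floor, and the test values are hypothetical. It only shows how a small change at the network/distribution interface can turn an overflowing KL term into a finite one.

```python
import numpy as np


def kl_logvar(mu, logvar):
    """KL(N(mu, sigma^2) || N(0, 1)) with sigma^2 = exp(logvar).

    The exponential overflows for large raw outputs (around 89 in float32).
    """
    return 0.5 * (np.exp(logvar) + mu**2 - 1.0 - logvar)


def softplus(x):
    """Overflow-safe log(1 + exp(x))."""
    return np.logaddexp(0.0, x)


def kl_softplus(mu, raw, eps=1e-6):
    """Same KL term, with sigma = softplus(raw) + eps (illustrative choice).

    sigma grows only linearly for large raw outputs, so sigma^2 stays finite.
    """
    sigma = softplus(raw) + eps
    return 0.5 * (sigma**2 + mu**2 - 1.0) - np.log(sigma)


if __name__ == "__main__":
    mu = np.zeros(1, dtype=np.float32)
    raw = np.full(1, 100.0, dtype=np.float32)  # a large, hypothetical encoder output

    with np.errstate(over="ignore"):  # silence the expected overflow warning
        print("log-variance KL:", kl_logvar(mu, raw))
    print("softplus KL:    ", kl_softplus(mu, raw))
```

With the float32 input above, the log-variance form evaluates to inf while the softplus form stays finite. For moderate raw outputs the two functions compute the same KL value whenever they describe the same sigma, so the change affects only the numerical behavior at the extremes.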
Abstract
We propose a theoretical approach towards the training numerical stability of Variational AutoEncoders (VAE). Our work is motivated by recent studies empowering VAEs to reach state of the art generative results on complex image datasets. These very deep VAE architectures, as well as VAEs using more complex output distributions, highlight a tendency to haphazardly produce high training gradients as well as NaN losses. The empirical fixes proposed to train them despite their limitations are neither fully theoretically grounded nor generally sufficient in practice. Building on this, we localize the source of the problem at the interface between the model's neural networks and their output probabilistic distributions. We explain a common source of instability stemming from an incautious formulation of the encoded Normal distribution's variance, and apply the same approach on other, less…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Image and Signal Denoising Methods
