Re-parameterizing VAEs for stability

David Dehaene; Rémy Brossard

arXiv:2106.13739·cs.LG·June 28, 2021·5 cites

Open Access

TL;DR

This paper introduces a theoretically grounded re-parameterization method for VAEs that enhances training stability by addressing issues at the neural network and distribution interface, enabling more reliable deep VAE training.

Contribution

It provides a novel, theoretically justified re-parameterization technique for Normal distributions in VAEs to improve training stability and prevent NaN losses.

Findings

1. Re-parameterization reduces training instability in deep VAEs

2. Small parameterization changes prevent NaN losses

3. Method enhances training reliability on complex datasets

Abstract

We propose a theoretical approach to the numerical stability of training Variational AutoEncoders (VAEs). Our work is motivated by recent studies empowering VAEs to reach state-of-the-art generative results on complex image datasets. These very deep VAE architectures, as well as VAEs using more complex output distributions, highlight a tendency to haphazardly produce high training gradients as well as NaN losses. The empirical fixes proposed to train them despite their limitations are neither fully theoretically grounded nor generally sufficient in practice. Building on this, we localize the source of the problem at the interface between the model's neural networks and their output probabilistic distributions. We explain a common source of instability stemming from an incautious formulation of the encoded Normal distribution's variance, and apply the same approach on other, less…
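To make the variance issue concrete, here is a minimal sketch of how the choice of variance parameterization can affect the KL term of a VAE with a standard Normal prior. The truncated abstract does not state the authors' re-parameterization, so this is not their method: the softplus alternative, the function names, and the `eps` floor are all assumptions chosen for illustration. The sketch compares the common choice of reading the raw network output as a log-variance with reading it as a pre-softplus standard deviation.

```python
import math


def softplus(x: float) -> float:
    # Stable form of log(1 + exp(x)); never exponentiates a large positive number.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def kl_exp_variance(mu: float, raw: float) -> float:
    """KL(N(mu, sigma^2) || N(0, 1)) with sigma^2 = exp(raw).

    `raw` stands for the unconstrained network output (a log-variance).
    """
    try:
        var = math.exp(raw)
    except OverflowError:
        # Python raises here; float32 tensor code would instead produce inf
        # (for raw above roughly 88.7), which can then turn into NaN.
        return math.inf
    return 0.5 * (mu * mu + var - raw - 1.0)


def kl_softplus_std(mu: float, raw: float, eps: float = 1e-6) -> float:
    """Same KL, but with sigma = softplus(raw) + eps (an illustrative assumption).

    sigma grows linearly, not exponentially, in `raw`, and eps keeps log(sigma) finite.
    """
    sigma = softplus(raw) + eps
    return 0.5 * (mu * mu + sigma * sigma - 2.0 * math.log(sigma) - 1.0)


if __name__ == "__main__":
    for raw in (0.0, 50.0, 1000.0):
        print(raw, kl_exp_variance(0.0, raw), kl_softplus_std(0.0, raw))
```

With the exponential form, a single large raw output makes the loss non-finite, whereas the softplus form stays finite (though still large) for the same input. Whether this particular alternative matches the paper's proposal cannot be determined from the abstract alone.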

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Image and Signal Denoising Methods