Generalized Doubly Reparameterized Gradient Estimators

Matthias Bauer; Andriy Mnih

arXiv:2101.11046 · stat.ML · July 14, 2021 · 1 cites

TL;DR

This paper introduces two generalizations of the DReGs gradient estimator that reduce gradient variance when training hierarchical and conditional VAEs, and shows that they train these models more effectively on image modelling tasks.

Contribution

The authors extend DReGs to hierarchical models with multiple stochastic layers and to score functions of arbitrary distributions, broadening its applicability.

Findings

01 Enhanced training of hierarchical VAEs with multiple stochastic layers.

02 Effective gradient variance reduction in image modeling tasks.

03 Broader applicability of DReGs to prior and posterior parameters.

Abstract

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders. Doubly-reparameterized gradients (DReGs) improve on the RT for multi-sample variational bounds by applying reparameterization a second time for an additional reduction in variance. Here, we develop two generalizations of the DReGs estimator and show that they can be used to train conditional and hierarchical VAEs on image modelling tasks more effectively. First, we extend the estimator to hierarchical models with several stochastic layers by showing how to treat additional score function terms due to the hierarchical variational posterior. We then generalize DReGs to score functions of arbitrary distributions instead of just those of the sampling distribution, which makes the estimator applicable to the parameters of the prior in addition…
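To make the abstract's starting point concrete, here is a minimal sketch of the original single-layer DReGs estimator next to the plain reparameterization estimator for a multi-sample bound. It does not implement either of the paper's generalizations. The toy model (prior N(0, 1), likelihood N(z, 1), Gaussian posterior with mean `mu` and standard deviation `sigma`), the hand-derived derivatives, and all function names are assumptions chosen for illustration, not taken from the paper.

```python
import math
import random

LOG_2PI = math.log(2.0 * math.pi)


def log_w(z, x, mu, sigma):
    """Log importance weight log p(x, z) - log q(z | x) for the toy model."""
    log_prior = -0.5 * z * z - 0.5 * LOG_2PI           # p(z) = N(0, 1)
    log_lik = -0.5 * (x - z) ** 2 - 0.5 * LOG_2PI      # p(x | z) = N(z, 1)
    log_q = (-0.5 * ((z - mu) / sigma) ** 2
             - math.log(sigma) - 0.5 * LOG_2PI)        # q(z | x) = N(mu, sigma^2)
    return log_prior + log_lik - log_q


def dlog_w_dz(z, x, mu, sigma):
    """Partial derivative of log w with respect to z, holding mu and sigma fixed."""
    return -z + (x - z) + (z - mu) / sigma ** 2


def grad_mu_estimates(x, mu, sigma, eps):
    """Return (rt, dregs): two estimates of d/dmu of the K-sample bound."""
    zs = [mu + sigma * e for e in eps]                 # reparameterized samples
    logs = [log_w(z, x, mu, sigma) for z in zs]
    top = max(logs)
    ws = [math.exp(l - top) for l in logs]             # numerically stabilized weights
    total = sum(ws)
    wn = [w / total for w in ws]                       # self-normalized weights
    # Plain reparameterization: for the location parameter mu, log q(z(mu); mu)
    # does not depend on mu once z = mu + sigma * eps is substituted, so only
    # the model terms contribute to the total derivative of log w.
    rt = sum(n * (-z + (x - z)) for n, z in zip(wn, zs))
    # DReGs: squared normalized weights times the path derivative (dz/dmu = 1).
    dregs = sum(n * n * dlog_w_dz(z, x, mu, sigma) for n, z in zip(wn, zs))
    return rt, dregs


rng = random.Random(0)
x = 1.0
eps = [rng.gauss(0.0, 1.0) for _ in range(8)]
# The exact posterior of this toy model is N(x / 2, 1 / 2).
rt, dregs = grad_mu_estimates(x, mu=x / 2, sigma=math.sqrt(0.5), eps=eps)
print("reparameterization estimate:", rt)
print("DReGs estimate:", dregs)
```

When the variational posterior equals the exact posterior, as in the usage lines above, the importance weights are constant in z, so the DReGs estimate vanishes up to floating-point error while the plain reparameterization estimate still fluctuates with the noise draw. Away from that point both estimators target the same gradient in expectation.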

Videos

Generalized Doubly Reparameterized Gradient Estimators (SlidesLive)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications