Generalized Doubly Reparameterized Gradient Estimators
Matthias Bauer, Andriy Mnih

TL;DR
This paper introduces two generalizations of the doubly reparameterized gradient (DReGs) estimator that reduce gradient variance when training hierarchical and conditional VAEs, so that these models train more effectively on image modeling tasks.
Contribution
The authors extend DReGs to hierarchical models with several stochastic layers and to score functions of arbitrary distributions rather than only the sampling distribution, which makes the estimator applicable to the parameters of the prior as well.
Findings
Enhanced training of hierarchical VAEs with multiple stochastic layers.
Effective gradient variance reduction in image modeling tasks.
Broader applicability of DReGs to prior and posterior parameters.
Abstract
Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders. Doubly-reparameterized gradients (DReGs) improve on the RT for multi-sample variational bounds by applying reparameterization a second time for an additional reduction in variance. Here, we develop two generalizations of the DReGs estimator and show that they can be used to train conditional and hierarchical VAEs on image modelling tasks more effectively. First, we extend the estimator to hierarchical models with several stochastic layers by showing how to treat additional score function terms due to the hierarchical variational posterior. We then generalize DReGs to score functions of arbitrary distributions instead of just those of the sampling distribution, which makes the estimator applicable to the parameters of the prior in addition…
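As a rough illustration of the baseline that the abstract builds on, the following JAX sketch computes a single-layer DReGs gradient for the variational parameters of a multi-sample bound. It is not the paper's generalized estimator. The toy model (p(z) = N(0, 1), p(x | z) = N(z, 1)), the Gaussian posterior parameterized by `mu` and `log_sigma`, and all function names are assumptions made for this example. The sketch uses a common surrogate formulation: the parameters inside the density of q are held fixed with `stop_gradient`, so that gradients flow only through the reparameterized sample, and each term is weighted by the squared normalized importance weight.

```python
import jax
import jax.numpy as jnp
from jax import lax
from jax.scipy.special import logsumexp


def log_normal(z, mu, log_sigma):
    """Log density of N(mu, exp(log_sigma)^2) evaluated at z."""
    return (-0.5 * ((z - mu) / jnp.exp(log_sigma)) ** 2
            - log_sigma - 0.5 * jnp.log(2.0 * jnp.pi))


def log_joint(x, z):
    """Toy model (an assumption): p(z) = N(0, 1), p(x | z) = N(z, 1)."""
    return log_normal(z, 0.0, 0.0) + log_normal(x, z, 0.0)


def dregs_surrogate(phi, x, eps):
    """Scalar whose gradient w.r.t. phi is the single-layer DReGs estimate."""
    mu, log_sigma = phi
    # First reparameterization: z depends on phi through the sampling path.
    z = mu + jnp.exp(log_sigma) * eps
    # Hold phi fixed inside log q so only the path through z is differentiated.
    log_q = log_normal(z, lax.stop_gradient(mu), lax.stop_gradient(log_sigma))
    log_w = log_joint(x, z) - log_q
    # Normalized importance weights, treated as constants.
    w_norm = lax.stop_gradient(jnp.exp(log_w - logsumexp(log_w)))
    # Squared weights are what distinguish DReGs from the plain
    # reparameterized multi-sample gradient.
    return jnp.sum(w_norm ** 2 * log_w)


def dregs_grad(phi, x, eps):
    """DReGs gradient estimate for phi = (mu, log_sigma)."""
    return jax.grad(dregs_surrogate)(phi, x, eps)


# Usage: K = 8 samples for a single observation x = 1.0.
eps = jax.random.normal(jax.random.PRNGKey(0), (8,))
phi = (jnp.array(0.0), jnp.array(0.0))
grad_mu, grad_log_sigma = dregs_grad(phi, 1.0, eps)
```

In this toy model the exact posterior is N(x/2, 1/2). When q equals it, the importance weights no longer depend on z, so every per-sample term of the estimate vanishes up to floating-point error. This is a handy sanity check for an implementation.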
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
