Implicit Reparameterization Gradients

Michael Figurnov; Shakir Mohamed; Andriy Mnih

arXiv:1805.08498 · cs.LG · January 31, 2019 · 63 cites

TL;DR

This paper introduces an implicit reparameterization gradient method that extends the reparameterization trick to a wider range of continuous distributions, improving efficiency and accuracy in gradient estimation.

Contribution

It presents a novel implicit differentiation approach for reparameterization gradients applicable to distributions like Gamma, Beta, Dirichlet, and von Mises, which are not suitable for the classic trick.

Findings

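1. Faster gradient computation than the existing gradient estimators for these distributions.

2. More accurate gradient estimates for the Gamma, Beta, Dirichlet, and von Mises distributions.

3. Reparameterization gradients become available for continuous distributions that the classic trick cannot handle.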

Abstract

By providing a simple and efficient way of computing low-variance gradients of continuous random variables, the reparameterization trick has become the technique of choice for training a variety of latent variable models. However, it is not applicable to a number of important continuous distributions. We introduce an alternative approach to computing reparameterization gradients based on implicit differentiation and demonstrate its broader applicability by applying it to Gamma, Beta, Dirichlet, and von Mises distributions, which cannot be used with the classic reparameterization trick. Our experiments show that the proposed approach is faster and more accurate than the existing gradient estimators for these distributions.
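
The abstract does not spell out the estimator, so the following is only a minimal sketch of the general idea. Fixing the value u = F(z | θ) of the CDF and differentiating implicitly gives dz/dθ = -(∂F/∂θ) / q(z | θ), where q is the density. The sketch below checks this identity on a Normal distribution, where the classic trick z = μ + σε provides a known answer (dz/dμ = 1, dz/dσ = (z - μ)/σ). The helper names (`normal_cdf`, `normal_pdf`, `implicit_grad`) and the use of central finite differences for ∂F/∂θ are assumptions made for illustration, not the authors' implementation.

```python
import math

def normal_cdf(z, mu, sigma):
    # CDF F(z | mu, sigma) of a Normal distribution.
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def normal_pdf(z, mu, sigma):
    # Density q(z | mu, sigma), which equals dF/dz.
    u = (z - mu) / sigma
    return math.exp(-0.5 * u * u) / (sigma * math.sqrt(2.0 * math.pi))

def implicit_grad(cdf, pdf, z, params, eps=1e-5):
    # Hypothetical helper: dz/dtheta_i = -(dF/dtheta_i) / q(z | theta).
    # dF/dtheta_i is approximated with central finite differences here;
    # this is an illustrative shortcut, not an exact derivative.
    density = pdf(z, *params)
    grads = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        dF = (cdf(z, *hi) - cdf(z, *lo)) / (2.0 * eps)
        grads.append(-dF / density)
    return grads

mu, sigma = 0.5, 2.0
z = 1.7  # a fixed sample value; no inverse CDF or noise variable is needed

dz_dmu, dz_dsigma = implicit_grad(normal_cdf, normal_pdf, z, (mu, sigma))
explicit_dz_dsigma = (z - mu) / sigma  # from z = mu + sigma * eps
```

Only the CDF and the density are evaluated at the sample, so the same pattern applies in principle to a distribution such as the Gamma, whose CDF can be computed but not inverted in closed form; a Gamma CDF routine would have to be substituted for `normal_cdf`.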

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Algorithms and Data Compression