Backpropagation through the Void: Optimizing control variates for black-box gradient estimation

Will Grathwohl; Dami Choi; Yuhuai Wu; Geoffrey Roeder; David Duvenaud

arXiv:1711.00123·cs.LG·February 27, 2018·100 cites

TL;DR

This paper presents a neural network-based framework for learning low-variance, unbiased gradient estimators applicable to black-box functions, improving optimization in settings like discrete latent models and reinforcement learning.

Contribution

The paper introduces a method for optimizing control variates that yields unbiased, low-variance gradient estimates for black-box functions, in both discrete and continuous settings.

Findings

01. Effective in training discrete latent-variable models

02. Provides an unbiased, action-conditional extension of advantage actor-critic

03. Demonstrates improved gradient estimation in black-box optimization

Abstract

Gradient-based optimization is the foundation of deep learning and reinforcement learning. Even when the mechanism being optimized is unknown or not differentiable, optimization using high-variance or biased gradient estimates is still often the best strategy. We introduce a general framework for learning low-variance, unbiased gradient estimators for black-box functions of random variables. Our method uses gradients of a neural network trained jointly with model parameters or policies, and is applicable in both discrete and continuous settings. We demonstrate this framework for training discrete latent-variable models. We also give an unbiased, action-conditional extension of the advantage actor-critic reinforcement learning algorithm.
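
Control variates, as named in the title, are commonly paired with the score-function (REINFORCE) estimator. The sketch below is a rough illustration only: it uses a fixed constant baseline rather than the learned neural-network control variate the abstract describes. The toy objective f, the target t = 0.45, the value theta = 0.3, and all function names are assumptions chosen for illustration. It shows that subtracting a baseline leaves the gradient estimate unbiased while shrinking its variance.

```python
import random
import statistics


def f(b, t=0.45):
    """Toy black-box objective (assumed): evaluated only, never differentiated."""
    return (b - t) ** 2


def score(b, theta):
    """Derivative of log Bernoulli(b | theta) with respect to theta."""
    return b / theta - (1 - b) / (1 - theta)


def grad_samples(theta, baseline, n, rng):
    """Draw n single-sample score-function estimates of d/dtheta E[f(b)]."""
    samples = []
    for _ in range(n):
        b = 1 if rng.random() < theta else 0
        # Subtracting a constant keeps the estimate unbiased because
        # the score has zero mean under Bernoulli(theta).
        samples.append((f(b) - baseline) * score(b, theta))
    return samples


rng = random.Random(0)
theta = 0.3  # assumed parameter value for the illustration

# Plain score-function estimator (no control variate).
plain = grad_samples(theta, 0.0, 20000, rng)

# Constant baseline: the exact mean of f under Bernoulli(theta).
baseline = theta * f(1) + (1 - theta) * f(0)
cv = grad_samples(theta, baseline, 20000, rng)

# Analytic gradient of E[f(b)] = theta * f(1) + (1 - theta) * f(0).
true_grad = f(1) - f(0)

print("true gradient:", true_grad)
print("plain    mean/var:", statistics.mean(plain), statistics.pvariance(plain))
print("baseline mean/var:", statistics.mean(cv), statistics.pvariance(cv))
```

With these settings the analytic gradient is f(1) - f(0) = 1 - 2t = 0.1. Both sample means should land near that value, and the baseline version should show a much smaller variance. A learned, input-dependent control variate of the kind the abstract describes would replace the constant `baseline` here.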

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks