An Investigation of the Bias-Variance Tradeoff in Meta-Gradients

Risto Vuorio; Jacob Beck; Shimon Whiteson; Jakob Foerster; Gregory Farquhar

arXiv:2209.11303 · cs.LG · September 26, 2022

TL;DR

This paper analyzes the bias-variance tradeoff in meta-gradient estimation for reinforcement learning, comparing methods like Hessian-based estimators, truncated backpropagation, and evolution strategies, especially in long-horizon settings.

Contribution

It provides a detailed empirical study disentangling bias and variance sources in meta-gradient estimators, highlighting limitations of Hessian-based methods and exploring alternatives.

Findings

01

Hessian estimation, as implemented by DiCE and its variants, always adds bias and can also add variance to the meta-gradient estimate.

02

Truncated backpropagation introduces bias, because it ignores how earlier inner-loop updates depend on the meta-parameters, in exchange for lower variance.

03

Evolution strategies offer a different tradeoff in long-horizon meta-learning.
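To make the evolution-strategies alternative concrete, here is a minimal sketch on a toy problem that is an assumption of this example, not the paper's setup: the meta-parameter is the inner learning rate `eta`, the inner loop runs gradient descent on the quadratic `0.5 * a * theta**2`, and the outer loss is the final inner loss with `a = 1`. The names `outer_loss` and `es_meta_gradient` are hypothetical. The estimator uses antithetic Gaussian perturbations and only needs evaluations of the outer loss, so it never backpropagates through the inner trajectory.

```python
import random


def outer_loss(eta, theta0=1.0, a=1.0, horizon=10):
    """Run `horizon` gradient steps on 0.5 * a * theta**2, return 0.5 * theta_T**2."""
    theta = theta0
    for _ in range(horizon):
        theta -= eta * a * theta  # inner update with learning rate eta
    return 0.5 * theta ** 2


def es_meta_gradient(eta, sigma=0.01, num_pairs=2000, seed=0):
    """Antithetic ES estimate of d(outer_loss)/d(eta).

    Each pair evaluates the outer loss at eta + sigma*eps and eta - sigma*eps
    and weights the difference by the shared perturbation eps.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_pairs):
        eps = rng.gauss(0.0, 1.0)
        diff = outer_loss(eta + sigma * eps) - outer_loss(eta - sigma * eps)
        total += diff * eps
    return total / (2.0 * sigma * num_pairs)


if __name__ == "__main__":
    eta = 0.1
    # Closed form for this toy problem: -T * a * theta0**2 * (1 - eta*a)**(2T - 1)
    exact = -10 * (1.0 - eta) ** 19
    print("exact meta-gradient:", exact)
    print("ES estimate:        ", es_meta_gradient(eta))
```

In this sketch the outer loss is deterministic, so all of the estimator's variance comes from sampling the perturbations, and the smoothing scale `sigma` adds a small bias. In an RL setting the loss evaluations themselves would also be noisy.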

Abstract

Meta-gradients provide a general approach for optimizing the meta-parameters of reinforcement learning (RL) algorithms. Estimation of meta-gradients is central to the performance of these meta-algorithms, and has been studied in the setting of MAML-style short-horizon meta-RL problems. In this context, prior work has investigated the estimation of the Hessian of the RL objective, as well as tackling the problem of credit assignment to pre-adaptation behavior by making a sampling correction. However, we show that Hessian estimation, implemented for example by DiCE and its variants, always adds bias and can also add variance to meta-gradient estimation. Meanwhile, meta-gradient estimation has been studied less in the important long-horizon setting, where backpropagation through the full inner optimization trajectories is not feasible. We study the bias and variance tradeoff arising from…
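The long-horizon issue raised in the abstract can be illustrated with a small sketch. It assumes a toy problem that is not taken from the paper: the meta-parameter is the inner learning rate `eta`, the inner loop is gradient descent on `0.5 * a * theta**2`, and the outer loss is `0.5 * theta_T**2`. The function name `meta_gradient` and its arguments are hypothetical. The derivative is accumulated with a forward sensitivity recursion, which in this scalar problem gives the same value as backpropagating through the differentiated steps. Setting `window` below `horizon` treats the parameters at the start of the window as constants, which is what truncation does.

```python
def meta_gradient(eta, theta0=1.0, a=1.0, horizon=20, window=None):
    """Derivative of 0.5 * theta_T**2 with respect to eta.

    window=None differentiates through all `horizon` inner steps.
    window=K differentiates only through the last K steps and treats
    theta at the start of that window as independent of eta.
    """
    if window is None:
        window = horizon
    start = horizon - window
    theta, dtheta = theta0, 0.0  # dtheta tracks d(theta_t)/d(eta)
    for t in range(horizon):
        if t >= start:
            # Differentiate theta_{t+1} = theta_t - eta * a * theta_t w.r.t. eta.
            dtheta = (1.0 - eta * a) * dtheta - a * theta
        theta = theta - eta * a * theta
    return theta * dtheta


if __name__ == "__main__":
    full = meta_gradient(0.1)
    for k in (20, 10, 5, 1):
        truncated = meta_gradient(0.1, window=k)
        print(f"window={k:2d}  estimate={truncated:.6f}  ratio to full={truncated / full:.3f}")
```

On this toy problem the truncated estimate equals the full meta-gradient scaled by `window / horizon`, so the bias grows as the window shrinks. The problem is deterministic, so the sketch isolates the bias from truncation and says nothing about variance.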

Code & Models

Repositories

vuoristo/meta-gradients (JAX, official)

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Multi-Objective Optimization Algorithms · Model Reduction and Neural Networks