Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning

Bryon Tjanaka; Matthew C. Fontaine; Julian Togelius; Stefanos Nikolaidis

arXiv:2202.03666 · cs.LG · April 18, 2022 · 5 cites

TL;DR

This paper explores how to effectively approximate gradients in differentiable quality diversity algorithms for reinforcement learning, enabling the training of diverse, high-performing agent policies in environments where exact gradients are unavailable.

Contribution

It introduces two gradient approximation variants for DQD algorithms in RL, demonstrating their effectiveness on locomotion tasks and highlighting current limitations.

Findings

01 One variant matches state-of-the-art results in QD and RL.

02 The other variant performs well on multiple locomotion tasks.

03 Results reveal limitations of current DQD algorithms with approximate gradients.

Abstract

Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent policies. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves…
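
To make "approximate gradients for performance and behavior" concrete, here is a minimal sketch of one black-box estimator often used in RL: an evolution-strategies style estimate with mirrored Gaussian perturbations. It is an illustration under stated assumptions, not the authors' implementation (the official code is in icaros-usc/dqd-rl). The names `es_gradient`, `toy_objective`, and `toy_measure` are hypothetical, and the toy functions stand in for a policy's return and a behavior descriptor that would normally come from environment rollouts.

```python
import numpy as np


def es_gradient(f, theta, sigma=0.1, n_pairs=50, rng=None):
    """Estimate the gradient of a black-box scalar function f at theta.

    Uses antithetic (mirrored) Gaussian perturbations, so only function
    evaluations are needed and f does not have to be differentiable.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    # One row of standard-normal noise per perturbation pair.
    eps = rng.standard_normal((n_pairs, theta.size))
    f_plus = np.array([f(theta + sigma * e) for e in eps])
    f_minus = np.array([f(theta - sigma * e) for e in eps])
    # Weight each noise direction by the difference of the mirrored evaluations.
    weighted = (f_plus - f_minus)[:, None] * eps
    return weighted.sum(axis=0) / (2.0 * sigma * n_pairs)


# Hypothetical stand-ins: in RL these values would come from policy rollouts.
def toy_objective(theta):
    """Toy 'performance': higher is better, maximized at the origin."""
    return -float(np.sum(theta ** 2))


def toy_measure(theta):
    """Toy 'behavior' measure: the first parameter."""
    return float(theta[0])


theta = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
objective_grad = es_gradient(toy_objective, theta, n_pairs=2000, rng=rng)
measure_grad = es_gradient(toy_measure, theta, n_pairs=2000, rng=rng)
print("approximate objective gradient:", objective_grad)
print("approximate measure gradient:", measure_grad)
```

A DQD algorithm would then use such estimates in place of exact objective and measure gradients. The estimates are noisy, and the noise shrinks as `n_pairs` grows, at the cost of more function evaluations (in RL, more rollouts).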

Code & Models

Repositories

icaros-usc/dqd-rl
PyTorch, Official

Taxonomy

Topics: Reinforcement Learning in Robotics