Approximating Gradients for Differentiable Quality Diversity in Reinforcement Learning
Bryon Tjanaka, Matthew C. Fontaine, Julian Togelius, Stefanos Nikolaidis

TL;DR
This paper explores how to effectively approximate gradients in differentiable quality diversity algorithms for reinforcement learning, enabling the training of diverse, high-performing agent policies in environments where exact gradients are unavailable.
Contribution
It introduces two gradient approximation variants for DQD algorithms in RL, demonstrating their effectiveness on locomotion tasks and highlighting current limitations.
Findings
One variant matches state-of-the-art results in QD and RL.
The other variant performs well on multiple locomotion tasks.
Results reveal limitations of current DQD algorithms with approximate gradients.
Abstract
Consider the problem of training robustly capable agents. One approach is to generate a diverse collection of agent policies. Training can then be viewed as a quality diversity (QD) optimization problem, where we search for a collection of performant policies that are diverse with respect to quantified behavior. Recent work shows that differentiable quality diversity (DQD) algorithms greatly accelerate QD optimization when exact gradients are available. However, agent policies typically assume that the environment is not differentiable. To apply DQD algorithms to training agent policies, we must approximate gradients for performance and behavior. We propose two variants of the current state-of-the-art DQD algorithm that compute gradients via approximation methods common in reinforcement learning (RL). We evaluate our approach on four simulated locomotion tasks. One variant achieves…
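To make "approximating gradients for performance and behavior" concrete, here is a minimal sketch of one approximation method common in RL, an evolution-strategies estimator with mirrored (antithetic) sampling. The abstract does not say which approximation methods the two variants use, so this is an illustration under stated assumptions and not the authors' implementation. The names `es_gradient`, `toy_objective`, and `toy_measure` are hypothetical, and the toy functions stand in for episode return and a behavior measure that would normally come from non-differentiable environment rollouts.

```python
import numpy as np


def es_gradient(f, theta, sigma=0.1, n_pairs=50, rng=None):
    """Estimate the gradient of a black-box scalar function f at theta.

    Uses mirrored Gaussian perturbations: for each noise vector e, f is
    evaluated at theta + sigma*e and theta - sigma*e, and the difference
    weights e. No derivative of f is ever required.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta, dtype=float)
    eps = rng.standard_normal((n_pairs, theta.size))
    f_plus = np.array([f(theta + sigma * e) for e in eps])
    f_minus = np.array([f(theta - sigma * e) for e in eps])
    weighted = (f_plus - f_minus)[:, None] * eps
    return weighted.sum(axis=0) / (2.0 * sigma * n_pairs)


# Hypothetical stand-ins for quantities obtained from environment rollouts.
def toy_objective(theta):
    # Performance: higher is better, maximized at the origin.
    return -float(np.sum(theta ** 2))


def toy_measure(theta):
    # Behavior measure: a scalar description of how the policy behaves.
    return float(np.sum(theta))


theta = np.array([1.0, -2.0, 0.5])
rng = np.random.default_rng(0)
grad_objective = es_gradient(toy_objective, theta, n_pairs=5000, rng=rng)
grad_measure = es_gradient(toy_measure, theta, n_pairs=5000, rng=rng)
```

For these toy functions the true gradients are `-2 * theta` and a vector of ones, so the estimates can be checked directly. In an RL setting each call to `f` would be one or more episodes, which is why the number of perturbation pairs dominates the cost of this kind of estimator.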
Taxonomy
Topics: Reinforcement Learning in Robotics
