No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
Alexander Rutherford, Michael Beukman, Timon Willi, Bruno Lacerda, Nick Hawes, Jakob Foerster

TL;DR
This paper critically examines existing Unsupervised Environment Design (UED) methods in reinforcement learning, showing that their practical scoring functions track success rate rather than regret, and proposes a new approach that prioritises training on scenarios with high learnability to improve agent robustness.
Contribution
It identifies the mismatch between theoretical regret maximization and practical environment selection, and introduces a learnability-focused training method that outperforms existing UED techniques.
Findings
The regret approximations used by existing UED methods correlate with success rate, not regret.
Training on learnable scenarios improves robustness in multiple environments.
Proposed method outperforms current UED approaches in Minigrid and robotics-inspired tasks.
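To make the idea of prioritising learnable scenarios concrete, here is a minimal sketch. It assumes learnability is scored as p(1 - p), where p is the agent's empirical success rate on a level; that formula is not stated in this summary, and the function names and example levels are hypothetical, chosen only for illustration. Under this assumption, levels the agent always solves (p = 1) or never solves (p = 0) score zero, while levels solved about half the time score highest.

```python
def learnability(success_rate: float) -> float:
    """Score a level by p * (1 - p), an assumed learnability measure.

    The score is 0 for levels that are always or never solved and
    peaks at 0.25 when the success rate is exactly 0.5.
    """
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must lie in [0, 1]")
    return success_rate * (1.0 - success_rate)


def select_learnable_levels(success_rates: dict, k: int) -> list:
    """Return the names of the k levels with the highest assumed learnability.

    success_rates maps a (hypothetical) level name to the agent's
    empirical success rate on that level.
    """
    ranked = sorted(
        success_rates,
        key=lambda name: learnability(success_rates[name]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical success rates gathered from evaluation rollouts.
example_rates = {
    "mastered": 1.0,     # already solved every time: nothing left to learn
    "impossible": 0.0,   # never solved: no learning signal yet
    "half": 0.5,         # solved half the time: most learnable
    "mostly": 0.8,       # usually solved: still somewhat learnable
}
chosen = select_learnable_levels(example_rates, k=2)
```

In this sketch, a success-rate-driven score would favour the "mastered" level, whereas the assumed learnability score ranks it last, which matches the summary's point that experience on mastered environments adds little.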
Abstract
What data or environments to use for training to improve downstream performance is a longstanding and very topical question in reinforcement learning. In particular, Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks. This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics. Surprisingly, despite methods aiming to maximise regret in theory, the practical approximations do not correlate with regret but with success rate. As a result, a significant portion of an agent's experience comes from environments it has already mastered, offering little to no contribution toward enhancing its abilities. Put differently, current methods fail to predict intuitive measures of "learnability." Specifically, they are…
Taxonomy
Topics: Educational Assessment and Pedagogy · Educational Assessment and Improvement · Statistics Education and Methodologies
Methods: Softmax · Attention Is All You Need
