Distributionally Robust Multi-Task Reinforcement Learning via Adaptive Task Sampling
Nicholas E. Corrado, Wenyuan Huang, Josiah P. Hanna

TL;DR
This paper introduces DRATS, a new adaptive task sampling method for multi-task reinforcement learning that improves data efficiency and worst-task performance by focusing on tasks that are furthest from being solved.
Contribution
The paper proposes a distributionally robust adaptive task sampling algorithm that addresses imbalanced data allocation across tasks in multi-task reinforcement learning.
Findings
DRATS outperforms existing sampling algorithms on MetaWorld benchmarks.
DRATS improves data efficiency in multi-task reinforcement learning.
DRATS increases worst-task performance in benchmark tests.
Abstract
Multi-task reinforcement learning (MTRL) aims to train a single agent to efficiently optimize performance across multiple tasks simultaneously. However, jointly optimizing all tasks often yields imbalanced learning: agents quickly solve easy tasks but learn slowly on harder ones. While prior work primarily attributes this imbalance to conflicting task gradients and proposes gradient manipulation or specialized architectures to address it, we instead focus on a distinct and under-explored challenge: imbalanced data allocation. Standard MTRL allocates an equal number of environment interactions to each task, which over-allocates data to easy tasks that require relatively few interactions to solve and under-allocates data to hard tasks that require substantially more experience to solve. To address this challenge, we introduce Distributionally Robust Adaptive Task Sampling (DRATS), an…
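The data-allocation idea in the abstract can be illustrated with a small sketch. This is not the DRATS algorithm, whose details are cut off above. It is one simple, hypothetical way to shift environment interactions toward tasks that are furthest from being solved. The softmax weighting, the `temperature` parameter, the use of success rate as the progress signal, and the function names `task_sampling_probs` and `allocate_episodes` are all assumptions made for illustration.

```python
import math
import random


def task_sampling_probs(success_rates, temperature=0.5):
    """Hypothetical weighting: softmax over each task's gap to being solved.

    The gap is taken as (1 - success rate), an assumed progress signal.
    A lower temperature concentrates sampling on the hardest tasks.
    """
    gaps = [1.0 - s for s in success_rates]
    max_gap = max(gaps)  # subtract the max for numerical stability
    weights = [math.exp((g - max_gap) / temperature) for g in gaps]
    total = sum(weights)
    return [w / total for w in weights]


def allocate_episodes(success_rates, num_episodes, temperature=0.5, seed=0):
    """Sample which task each of the next episodes is collected from."""
    rng = random.Random(seed)
    probs = task_sampling_probs(success_rates, temperature)
    task_ids = range(len(success_rates))
    counts = [0] * len(success_rates)
    for task in rng.choices(task_ids, weights=probs, k=num_episodes):
        counts[task] += 1
    return counts


# Illustrative success rates for an easy, a medium, and a hard task.
success_rates = [0.95, 0.60, 0.10]
probs = task_sampling_probs(success_rates)
counts = allocate_episodes(success_rates, num_episodes=100)
uniform_probs = task_sampling_probs([0.5, 0.5, 0.5])
```

When every task has the same success rate, this weighting reduces to the uniform allocation that the abstract describes as standard in MTRL. As the rates diverge, the sampling probability moves toward the tasks with the lowest success rate.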
