Self-Paced Deep Reinforcement Learning
Pascal Klink, Carlo D'Eramo, Jan Peters, Joni Pajarinen

TL;DR
This paper introduces a method for automatic curriculum generation in reinforcement learning by framing it as an inference problem, leading to improved learning efficiency and stability across various environments.
Contribution
It proposes a novel inference-based approach for automatic curriculum generation that adapts to the agent, with strong theoretical backing and practical integration with deep RL.
Findings
The generated curricula improve learning speed and stability.
The method matches or outperforms existing CRL algorithms in multiple environments.
The approach integrates easily with deep RL algorithms.
Abstract
Curriculum reinforcement learning (CRL) improves the learning speed and stability of an agent by exposing it to a tailored series of tasks throughout learning. Despite empirical successes, an open question in CRL is how to automatically generate a curriculum for a given reinforcement learning (RL) agent, avoiding manual design. In this paper, we propose an answer by interpreting the curriculum generation as an inference problem, where distributions over tasks are progressively learned to approach the target task. This approach leads to an automatic curriculum generation, whose pace is controlled by the agent, with solid theoretical motivation and easily integrated with deep RL algorithms. In the conducted experiments, the curricula generated with the proposed algorithm significantly improve learning performance across several environments and deep RL algorithms, matching or…
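The idea of a task distribution that approaches the target at a pace set by the agent can be illustrated with a toy sketch. This is not the authors' algorithm, whose objective is not given in this summary. It assumes a one-dimensional Gaussian over a task parameter, a hypothetical performance threshold, and a hypothetical KL budget per update. The names `kl_gauss`, `curriculum_step`, `threshold`, and `eps` are all made up for illustration.

```python
import math


def kl_gauss(mu_p, sd_p, mu_q, sd_q):
    """KL(p || q) between two 1-D Gaussians given as mean and std."""
    return (math.log(sd_q / sd_p)
            + (sd_p ** 2 + (mu_p - mu_q) ** 2) / (2 * sd_q ** 2)
            - 0.5)


def curriculum_step(cur, target, avg_return, threshold=0.5, eps=0.05):
    """One toy curriculum update (illustrative assumption, not the paper's method).

    cur, target: (mean, std) of the current and target task distributions.
    The distribution only moves when the agent's average return reaches
    `threshold`, and each move is limited to KL(new || cur) <= eps.
    """
    if avg_return < threshold:
        return cur  # agent is not ready yet: keep the current tasks

    def interp(a):
        # Linear interpolation of mean and std toward the target.
        return (cur[0] + a * (target[0] - cur[0]),
                cur[1] + a * (target[1] - cur[1]))

    if kl_gauss(*interp(1.0), *cur) <= eps:
        return target  # target is within the KL budget: jump to it

    # Bisection for the largest step that respects the KL budget.
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if kl_gauss(*interp(mid), *cur) <= eps:
            lo = mid
        else:
            hi = mid
    return interp(lo)


if __name__ == "__main__":
    dist, target = (0.0, 1.0), (5.0, 0.1)
    print(curriculum_step(dist, target, avg_return=0.2))  # unchanged
    print(curriculum_step(dist, target, avg_return=0.9))  # small move toward target
```

In this sketch the agent's return acts as a gate, so a struggling agent keeps training on the current tasks, while the KL budget keeps consecutive task distributions close to each other.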
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Machine Learning and Data Classification
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
