CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
Cédric Colas, Pierre Fournier, Olivier Sigaud, Mohamed Chetouani, Pierre-Yves Oudeyer

TL;DR
CURIOUS is a modular, intrinsically motivated reinforcement learning algorithm that builds its own curriculum by directing practice toward the goals with the highest absolute learning progress, so that a single policy acquires skills of increasing complexity without external scheduling.
Contribution
The paper combines a modular Universal Value Function Approximator trained with hindsight learning and an automated curriculum driven by absolute learning progress, letting one policy pursue goals of different kinds.
Findings
Demonstrates self-organized curriculum development in robotic environments
Shows robustness to distracting goals and forgetting
Enables learning across a range of goal complexities
Abstract
In open-ended environments, autonomous learning agents must set their own goals and build their own curriculum through an intrinsically motivated exploration. They may consider a large diversity of goals, aiming to discover what is controllable in their environments, and what is not. Because some goals might prove easy and some impossible, agents must actively select which goal to practice at any moment, to maximize their overall mastery on the set of learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a modular Universal Value Function Approximator with hindsight learning to achieve a diversity of goals of different kinds within a unique policy and 2) an automated curriculum learning mechanism that biases the attention of the agent towards goals maximizing the absolute learning progress. Agents focus sequentially on goals of increasing complexity, and focus…
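The curriculum mechanism described in the abstract, biasing attention toward goals with the highest absolute learning progress, can be illustrated with a small sketch. This is not the authors' implementation: the class name `LearningProgressSelector`, the window size, the split of the window into an older and a recent half, and the `epsilon` mix with a uniform distribution are all assumptions made for illustration.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical goal-module selector driven by absolute learning progress."""

    def __init__(self, n_modules, window=20, epsilon=0.4, seed=0):
        self.n_modules = n_modules
        self.epsilon = epsilon  # assumed share of uniform exploration
        # One bounded history of success/failure outcomes per goal module.
        self.history = [deque(maxlen=window) for _ in range(n_modules)]
        self.rng = random.Random(seed)

    def record(self, module, success):
        """Store the outcome (1.0 or 0.0) of one evaluation rollout."""
        self.history[module].append(1.0 if success else 0.0)

    def abs_learning_progress(self, module):
        """Absolute change in success rate between the older and recent halves."""
        outcomes = list(self.history[module])
        half = len(outcomes) // 2
        if half == 0:
            return 0.0
        older, recent = outcomes[:half], outcomes[-half:]
        return abs(sum(recent) / half - sum(older) / half)

    def probabilities(self):
        """Mix a uniform distribution with one proportional to learning progress."""
        lp = [self.abs_learning_progress(m) for m in range(self.n_modules)]
        total = sum(lp)
        uniform = 1.0 / self.n_modules
        if total == 0.0:
            # No measurable progress anywhere: fall back to uniform sampling.
            return [uniform] * self.n_modules
        return [self.epsilon * uniform + (1.0 - self.epsilon) * x / total
                for x in lp]

    def sample(self):
        """Draw the index of the next module to practice."""
        weights = self.probabilities()
        return self.rng.choices(range(self.n_modules), weights=weights)[0]


selector = LearningProgressSelector(n_modules=3)
for i in range(20):
    selector.record(0, True)      # already mastered: competence is flat at 1
    selector.record(1, i >= 10)   # improving: failures followed by successes
    selector.record(2, False)     # impossible: competence is flat at 0
probs = selector.probabilities()
```

In this toy run only the improving module has non-zero learning progress, so it receives most of the probability mass, while the mastered and the impossible modules keep only the uniform exploration share. Because the measure is an absolute value, a drop in competence on a previously mastered module would raise its probability in the same way, which is one plausible reading of how such a scheme could respond to forgetting.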
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Robot Manipulation and Learning
Methods: Experience Replay · Deep Deterministic Policy Gradient
