Proximal Curriculum for Reinforcement Learning Agents
Georgios Tzannetos, Bárbara Gomes Ribeiro, Parameswaran Kamalaruban, Adish Singla

TL;DR
This paper introduces ProCuRL, a curriculum strategy inspired by the pedagogical concept of the Zone of Proximal Development (ZPD). It dynamically selects tasks that are neither too hard nor too easy for the learner, and it improves the training efficiency of reinforcement learning agents across a variety of domains.
Contribution
We propose ProCuRL, a theoretically grounded curriculum method inspired by ZPD, with a practical variant for deep RL that outperforms existing approaches.
Findings
ProCuRL accelerates training of deep RL agents.
ProCuRL outperforms state-of-the-art curriculum methods.
Theoretical analysis supports ProCuRL's effectiveness.
Abstract
We consider the problem of curriculum design for reinforcement learning (RL) agents in contextual multi-task settings. Existing techniques on automatic curriculum design typically require domain-specific hyperparameter tuning or have limited theoretical underpinnings. To tackle these limitations, we design our curriculum strategy, ProCuRL, inspired by the pedagogical concept of Zone of Proximal Development (ZPD). ProCuRL captures the intuition that learning progress is maximized when picking tasks that are neither too hard nor too easy for the learner. We mathematically derive ProCuRL by analyzing two simple learning settings. We also present a practical variant of ProCuRL that can be directly integrated with deep RL frameworks with minimal hyperparameter tuning. Experimental results on a variety of domains demonstrate the effectiveness of our curriculum strategy over state-of-the-art…
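The "neither too hard nor too easy" intuition can be illustrated with a small sketch. This is not the authors' implementation: it assumes each task has an estimated probability of success p for the current learner, scores a task by p * (1 - p) so that the score peaks at intermediate difficulty, and samples a task with a softmax over those scores. The function names `zpd_scores` and `sample_task` and the temperature parameter `beta` are hypothetical and chosen only for illustration.

```python
import math
import random


def zpd_scores(success_probs):
    """Score each task by p * (1 - p), where p is the learner's estimated
    probability of success on that task (an assumed proxy for 'proximal').
    The score is 0 for impossible or already-solved tasks and is largest
    at p = 0.5."""
    return [p * (1.0 - p) for p in success_probs]


def sample_task(success_probs, beta=10.0, rng=random):
    """Sample a task index with probability proportional to
    exp(beta * score). A larger beta (hypothetical temperature parameter)
    makes the choice greedier; beta = 0 gives uniform sampling."""
    scores = zpd_scores(success_probs)
    top = max(scores)
    # Subtract the maximum score before exponentiating for numerical stability.
    weights = [math.exp(beta * (s - top)) for s in scores]
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]


if __name__ == "__main__":
    # Three hypothetical tasks: too hard, intermediate, too easy.
    probs = [0.05, 0.5, 0.95]
    print(zpd_scores(probs))
    print(sample_task(probs, beta=50.0, rng=random.Random(0)))
```

In a training loop, the success estimates would be refreshed as the agent improves, so the preferred tasks shift toward harder ones over time. How those estimates are obtained is left open in this sketch.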
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Machine Learning and Data Classification
