TL;DR
This paper introduces Success Induced Task Prioritization (SITP), an automatic curriculum learning framework that sequences tasks based on success rates to accelerate reinforcement learning.
Contribution
The paper proposes SITP, a method for automatic curriculum design in RL that dynamically reprioritizes tasks according to each task's success rate, improving learning efficiency with minimal overhead.
Findings
SITP matches or surpasses existing curriculum methods on the benchmarks tested.
SITP is easy to implement with minor modifications to standard RL frameworks.
SITP demonstrates effectiveness in POGEMA and Procgen environments.
Abstract
Many challenging reinforcement learning (RL) problems require designing a distribution of tasks that can be applied to train effective policies. This distribution of tasks can be specified by the curriculum. A curriculum is meant to improve the results of learning and accelerate it. We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning, where a task sequence is created based on the success rate of each task. In this setting, each task is an algorithmically created environment instance with a unique configuration. The algorithm selects the order of tasks that provide the fastest learning for agents. The probability of selecting any of the tasks for the next stage of learning is determined by evaluating its performance score in previous stages. Experiments were carried out in the Partially Observable Grid Environment for Multiple Agents…
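The selection rule described in the abstract, where a task's sampling probability depends on how it scored in earlier stages, can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the class name `SuccessRateTaskSampler`, the scoring rule (smoothed absolute change in success rate), the softmax with a `temperature` parameter, and the `smoothing` factor are all assumptions made for the example.

```python
import math
import random


class SuccessRateTaskSampler:
    """Hypothetical sampler: tasks whose success rate changed most get sampled more.

    The scoring rule below is an assumption for illustration, not the
    formula confirmed by the source.
    """

    def __init__(self, num_tasks, temperature=0.1, smoothing=0.5, seed=None):
        self.num_tasks = num_tasks
        self.temperature = temperature      # lower values sharpen the distribution
        self.smoothing = smoothing          # weight given to the newest observation
        self.success_rate = [0.0] * num_tasks
        self.score = [0.0] * num_tasks
        self.rng = random.Random(seed)

    def update(self, task_id, successes, episodes):
        """Record the outcome of one training stage on a task."""
        new_rate = successes / episodes
        change = abs(new_rate - self.success_rate[task_id])
        # Exponential moving average of the change in success rate.
        self.score[task_id] = (
            self.smoothing * change + (1 - self.smoothing) * self.score[task_id]
        )
        self.success_rate[task_id] = new_rate

    def probabilities(self):
        """Softmax over task scores (max subtracted for numerical stability)."""
        top = max(self.score)
        weights = [math.exp((s - top) / self.temperature) for s in self.score]
        total = sum(weights)
        return [w / total for w in weights]

    def sample(self):
        """Draw the task index for the next training stage."""
        return self.rng.choices(
            range(self.num_tasks), weights=self.probabilities(), k=1
        )[0]


sampler = SuccessRateTaskSampler(num_tasks=3, seed=0)
sampler.update(task_id=1, successes=8, episodes=10)
next_task = sampler.sample()
```

In a training loop, `update` would be called after each stage with the episode outcomes for the task just trained, and `sample` would pick the environment configuration for the next stage. Keeping the sampler outside the agent is one way to leave the underlying RL algorithm unchanged.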
