TL;DR
BaRC introduces a curriculum learning approach using backward reachability and physical priors to significantly improve sample efficiency and training speed in goal-directed robotic reinforcement learning tasks.
Contribution
The paper proposes BaRC, a novel backward reachability curriculum that leverages approximate system dynamics to enhance model-free RL training in continuous control tasks.
Findings
BaRC yields substantial performance improvements over previous curriculum methods.
It effectively accelerates training in robotic control problems.
It is generally applicable to a variety of model-free RL algorithms.
Abstract
Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often forces training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state of the system. In this work, we leverage physical priors in the form of an approximate system dynamics model to design a curriculum scheme for a model-free policy optimization algorithm. Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a…
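The curriculum idea in the abstract (start training from states close to the goal, then grow the set of start states backwards) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the state space is a toy 1-D line of ten cells, the "approximate dynamics model" is the hypothetical `line_predecessors` function, and policy training is left as a comment. The names `backward_reachable` and `curriculum` are likewise hypothetical.

```python
from typing import Callable, Iterable, List, Set

State = int  # toy assumption: states are cells on a 1-D line


def backward_reachable(
    starts: Set[State],
    predecessors: Callable[[State], Iterable[State]],
    steps: int,
) -> Set[State]:
    """Return all states that can reach `starts` in at most `steps` actions
    under the (approximate) model described by `predecessors`."""
    reached = set(starts)
    frontier = set(starts)
    for _ in range(steps):
        # One backward step: every state with a successor in the frontier.
        frontier = {p for s in frontier for p in predecessors(s)} - reached
        reached |= frontier
    return reached


def line_predecessors(s: State, n: int = 10) -> List[State]:
    """Hypothetical approximate model: one action moves the agent one cell
    left or right on cells 0..n-1, so predecessors are the neighbours."""
    return [p for p in (s - 1, s + 1) if 0 <= p < n]


def curriculum(
    goal: State,
    initial_state: State,
    predecessors: Callable[[State], Iterable[State]],
    steps_per_stage: int,
    max_stages: int = 100,
) -> List[Set[State]]:
    """Grow the start-state set backwards from the goal until it covers
    the task's true initial state. Returns the start set of each stage."""
    starts = {goal}
    stages = [set(starts)]
    while initial_state not in starts and len(stages) < max_stages:
        # A real curriculum would train the model-free policy from `starts`
        # here before expanding; that step is omitted in this sketch.
        starts = backward_reachable(starts, predecessors, steps_per_stage)
        stages.append(set(starts))
    return stages


if __name__ == "__main__":
    for i, stage in enumerate(curriculum(9, 0, line_predecessors, 3)):
        print(i, sorted(stage))
```

In this toy setting the start set widens by three cells per stage until it contains the initial state, so the policy never has to explore from far away before it has seen a reward. A continuous-control version would replace the discrete predecessor function with a reachability computation on the approximate dynamics, which this sketch does not attempt.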
