Flexible and Efficient Long-Range Planning Through Curious Exploration
Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins

TL;DR
The paper introduces the Curious Sample Planner (CSP), a planning method that combines curiosity-driven sampling with imitation learning to efficiently discover complex, long-range plans in realistic 3D tasks, outperforming standard planning and learning methods.
Contribution
The paper proposes CSP, a hybrid approach that integrates ideas from Task and Motion Planning (TAMP) and deep reinforcement learning (DRL) to improve the efficiency and generalization of long-range planning in complex physical environments.
Findings
CSP efficiently discovers complex plans in 3D tasks.
CSP outperforms standard planning and learning methods.
CSP enables transfer learning across related tasks.
Abstract
Identifying algorithms that flexibly and efficiently discover temporally-extended multi-phase plans is an essential step for the advancement of robotics and model-based reinforcement learning. The core problem of long-range planning is finding an efficient way to search through the tree of possible action sequences. Existing non-learned planning solutions from the Task and Motion Planning (TAMP) literature rely on the existence of logical descriptions for the effects and preconditions for actions. This constraint allows TAMP methods to efficiently reduce the tree search problem but limits their ability to generalize to unseen and complex physical environments. In contrast, deep reinforcement learning (DRL) methods use flexible neural-network-based function approximators to discover policies that generalize naturally to unseen circumstances. However, DRL methods struggle to handle the…
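The idea of searching the tree of action sequences with a curiosity bias can be illustrated with a small sketch. This is not the authors' CSP implementation: the 2D point environment, the `step` dynamics, and the function `curious_tree_search` are hypothetical, and the distance to the nearest already-visited state is used as a simple stand-in for a learned curiosity signal. Nodes in sparsely explored regions are more likely to be selected for expansion.

```python
import math
import random


def step(state, action):
    """Toy dynamics (assumption): move a 2D point, clipped to the unit square."""
    x, y = state
    dx, dy = action
    return (min(1.0, max(0.0, x + dx)), min(1.0, max(0.0, y + dy)))


def curious_tree_search(start, goal, tol=0.1, max_expansions=2000, seed=0):
    """Grow a search tree, preferring to expand nodes in novel regions.

    Returns the list of states from start to a state within `tol` of goal,
    or None if the expansion budget runs out first.
    """
    rng = random.Random(seed)
    states, parents = [start], [None]
    # Novelty proxy: distance from each node to its nearest other node.
    # The root starts with a placeholder value of 1.0.
    novelty = [1.0]

    for _ in range(max_expansions):
        # Sample a node with probability proportional to its novelty;
        # the small constant keeps every weight strictly positive.
        weights = [n + 1e-6 for n in novelty]
        i = rng.choices(range(len(states)), weights=weights, k=1)[0]

        # Sample a random action and simulate it from the chosen node.
        action = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
        child = step(states[i], action)

        # Update nearest-neighbor distances incrementally.
        dists = [math.dist(child, s) for s in states]
        novelty = [min(n, d) for n, d in zip(novelty, dists)]
        states.append(child)
        parents.append(i)
        novelty.append(min(dists))

        if math.dist(child, goal) <= tol:
            # Recover the plan by following parent pointers back to the root.
            path, k = [], len(states) - 1
            while k is not None:
                path.append(states[k])
                k = parents[k]
            return path[::-1]
    return None
```

Calling `curious_tree_search((0.1, 0.1), (0.9, 0.9))` returns either a state path ending near the goal or `None` if the budget is exhausted. Replacing the nearest-neighbor score with a learned prediction-error signal would move the sketch closer to curiosity as the term is used in the reinforcement learning literature.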
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms
