Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation

Daesol Cho; Seungjae Lee; H. Jin Kim

arXiv:2301.11741 · cs.LG · February 21, 2023

TL;DR

This paper introduces an uncertainty and temporal distance-aware curriculum goal generation method for outcome-directed reinforcement learning, significantly improving sample efficiency and goal proposal accuracy in complex navigation and robotic tasks.

Contribution

It presents a novel curriculum generation approach that provides calibrated guidance without prior domain knowledge, outperforming previous methods in challenging tasks.

Findings

01  Outperforms prior curriculum RL methods in navigation and robotic manipulation tasks.

02  Provides more accurate and geometry-agnostic curriculum goal proposals.

03  Enhances sample efficiency in outcome-directed reinforcement learning.

Abstract

Current reinforcement learning (RL) often suffers when solving a challenging exploration problem where the desired outcomes or high rewards are rarely observed. Even though curriculum RL, a framework that solves complex tasks by proposing a sequence of surrogate tasks, shows reasonable results, most of the previous works still have difficulty in proposing curriculum due to the absence of a mechanism for obtaining calibrated guidance to the desired outcome state without any prior domain knowledge. To alleviate it, we propose an uncertainty & temporal distance-aware curriculum goal generation method for the outcome-directed RL via solving a bipartite matching problem. It could not only provide precisely calibrated guidance of the curriculum to the desired outcome states but also bring much better sample efficiency and geometry-agnostic curriculum goal proposal capability compared to…
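
The abstract frames curriculum goal selection as a bipartite matching problem. The sketch below only illustrates that general idea and is not the authors' implementation: the cost formula, the `weight` parameter, the function names `build_cost` and `match_goals`, and all numbers are assumptions made for this example. It pairs candidate states with desired outcome states so that the summed cost is minimal, where a candidate is cheaper when it is temporally closer to an outcome and when its (hypothetical) uncertainty score is higher.

```python
from itertools import permutations


def build_cost(uncertainty, temporal_distance, weight=1.0):
    """Assumed cost: temporal distance minus a weighted uncertainty bonus.

    uncertainty[i]          -- hypothetical uncertainty score of candidate i
    temporal_distance[i][j] -- hypothetical steps from candidate i to outcome j
    """
    return [
        [d - weight * uncertainty[i] for d in row]
        for i, row in enumerate(temporal_distance)
    ]


def match_goals(cost):
    """Return (picks, total): picks[j] is the candidate matched to outcome j.

    Brute-force minimum-cost matching over all injective assignments.
    Only suitable for tiny inputs; needs len(cost) >= len(cost[0]).
    """
    n_candidates, n_outcomes = len(cost), len(cost[0])
    best_total, best_picks = float("inf"), None
    # Each permutation assigns one distinct candidate to every outcome.
    for picks in permutations(range(n_candidates), n_outcomes):
        total = sum(cost[i][j] for j, i in enumerate(picks))
        if total < best_total:
            best_total, best_picks = total, picks
    return list(best_picks), best_total


# Illustrative numbers only: 4 candidate states, 2 desired outcome states.
uncertainty = [0.9, 0.1, 0.5, 0.7]
temporal_distance = [[4, 6], [1, 2], [3, 1], [5, 5]]

cost = build_cost(uncertainty, temporal_distance, weight=2.0)
picks, total = match_goals(cost)
print(picks, total)
```

The brute-force search keeps the example dependency-free. For realistic batch sizes, a dedicated assignment solver would be the natural replacement, since the number of permutations grows factorially.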

Code & Models

Repositories

jaylee0301/outpace_official · PyTorch · Official

Videos

Outcome-directed Reinforcement Learning by Uncertainty & Temporal Distance-Aware Curriculum Goal Generation · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Path Planning Algorithms