Goal-conditioned Imitation Learning

Yiming Ding; Carlos Florensa; Mariano Phielipp; Pieter Abbeel

arXiv:1906.05838 · cs.LG · May 28, 2020 · 32 cites

TL;DR

This paper introduces a goal-conditioned imitation learning approach that leverages demonstrations to efficiently train policies capable of reaching diverse goals in robotics, reducing sample complexity and handling demonstrations without action data.

Contribution

The work presents a novel method integrating demonstrations into goal-conditioned RL, improving convergence speed and performance, even with actionless trajectories.

Findings

01 Speeds up policy learning compared to traditional HER.

02 Effective with demonstrations lacking action information.

03 Surpasses prior imitation learning methods in goal-reaching tasks.

Abstract

Designing rewards for Reinforcement Learning (RL) is challenging because they need to convey the desired task, be efficient to optimize, and be easy to compute. The last of these is particularly problematic when applying RL to robotics, where detecting whether the desired configuration is reached might require considerable supervision and instrumentation. Furthermore, we are often interested in being able to reach a wide range of configurations, hence setting up a different reward every time might be impractical. Methods like Hindsight Experience Replay (HER) have recently shown promise in learning policies able to reach many goals, without the need for a reward. Unfortunately, without tricks like resetting to points along the trajectory, HER might require many samples to discover how to reach certain areas of the state-space. In this work we investigate different approaches to incorporate…
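As a rough illustration of the hindsight relabeling idea behind HER that the abstract refers to, the sketch below takes one recorded trajectory and relabels each transition with a goal drawn from states actually reached later in the same trajectory. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `relabel_future` and `reached`, the parameter `k`, the tuple layout, and the 0/1 reward are all hypothetical choices made for illustration.

```python
import random
from typing import List, Optional, Tuple

State = Tuple[float, ...]
Step = Tuple[State, int, State]                 # (state, action, next_state)
Transition = Tuple[State, int, State, State, float]  # (..., goal, reward)


def reached(state: State, goal: State, eps: float = 1e-6) -> bool:
    """Hypothetical success test: Euclidean distance to the goal is at most eps."""
    return sum((s - g) ** 2 for s, g in zip(state, goal)) ** 0.5 <= eps


def relabel_future(
    trajectory: List[Step],
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Transition]:
    """Relabel each step with k goals sampled from later states of the same trajectory.

    Assumption: the reward is 1.0 when the next state matches the relabeled
    goal and 0.0 otherwise, so no hand-designed task reward is needed.
    """
    rng = rng or random.Random(0)
    relabeled: List[Transition] = []
    last = len(trajectory) - 1
    for t, (state, action, next_state) in enumerate(trajectory):
        for _ in range(k):
            # Pick a step index in [t, last]; the state reached there becomes the goal.
            j = rng.randint(t, last)
            goal = trajectory[j][2]
            reward = 1.0 if reached(next_state, goal) else 0.0
            relabeled.append((state, action, next_state, goal, reward))
    return relabeled


if __name__ == "__main__":
    # A toy 1-D trajectory that moves right one unit per step.
    traj = [((0.0,), 1, (1.0,)), ((1.0,), 1, (2.0,)), ((2.0,), 1, (3.0,))]
    for transition in relabel_future(traj, k=2):
        print(transition)
```

Because every relabeled goal is a state the agent really visited, some transitions always carry a positive reward, even when the originally commanded goal was never reached. The relabeled transitions would typically be stored in a replay buffer for an off-policy learner; that part is omitted here.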

Code & Models

Repositories

dingyiming0427/goalgail (official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Robot Manipulation and Learning

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Experience Replay