Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration

Alexandre Chenu; Olivier Serris; Olivier Sigaud; Nicolas Perrin-Gilbert

arXiv:2211.04786·cs.RO·April 18, 2023

TL;DR

This paper introduces DCIL-II, a novel algorithm that leverages sequential goals and a single demonstration to efficiently learn complex robotic control tasks, significantly reducing the need for multiple demonstrations.

Contribution

The paper presents DCIL-II, a new goal-conditioned reinforcement learning method that exploits sequentiality to learn complex tasks from a single demonstration with high sample efficiency.

Findings

01. Successfully applied to humanoid locomotion and stand-up tasks
02. Achieved unprecedented sample efficiency in simulated tasks
03. Enabled fast learning of complex robotic behaviors

Abstract

Deep Reinforcement Learning has been successfully applied to learn robotic control. However, the corresponding algorithms struggle when applied to problems where the agent is only rewarded after achieving a complex task. In this context, using demonstrations can significantly speed up the learning process, but demonstrations can be costly to acquire. In this paper, we propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration. To do so, our method learns a goal-conditioned policy to control a system between successive low-dimensional goals. This sequential goal-reaching approach raises a problem of compatibility between successive goals: we need to ensure that the state resulting from reaching a goal is compatible with the achievement of the following goals. To tackle this problem, we present a new algorithm called DCIL-II. We…
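
The sequential goal-reaching idea described in the abstract can be illustrated with a toy sketch. This is not the DCIL-II algorithm and not code from the authors' repository: the helper names (`extract_goals`, `run_episode`, `project`, `policy`, `step`), the fixed goal spacing, the distance tolerance, and the hand-written point-mass policy are all assumptions made for illustration. It only shows how a single demonstration might be cut into successive low-dimensional goals and how a goal-conditioned policy could be switched from one goal to the next. It does not handle the compatibility problem between successive goals that the abstract raises.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vec = Tuple[float, ...]


def extract_goals(demo: Sequence[Vec], project: Callable[[Vec], Vec],
                  spacing: float) -> List[Vec]:
    """Keep a projected demo state as a goal each time it lies at least
    `spacing` away (in goal space) from the previously kept one."""
    goals: List[Vec] = []
    last = project(demo[0])
    for state in demo[1:]:
        g = project(state)
        if math.dist(g, last) >= spacing:
            goals.append(g)
            last = g
    final = project(demo[-1])
    if not goals or goals[-1] != final:
        goals.append(final)  # always finish on the end of the demonstration
    return goals


def run_episode(state: Vec, goals: List[Vec], policy, step, project,
                tol: float, max_steps: int):
    """Condition the policy on the current goal; switch to the next goal
    once the projected state is within `tol` of the current one."""
    idx = 0
    for _ in range(max_steps):
        state = step(state, policy(state, goals[idx]))
        if math.dist(project(state), goals[idx]) <= tol:
            idx += 1
            if idx == len(goals):
                break
    return state, idx


# Hypothetical toy system: state = (x, y, speed), goal space = (x, y).
def project(state: Vec) -> Vec:
    return tuple(state[:2])


def policy(state: Vec, goal: Vec, max_move: float = 0.25) -> Vec:
    # Hand-written stand-in for a learned goal-conditioned policy:
    # move straight toward the goal, at most `max_move` per step.
    delta = tuple(g - s for g, s in zip(goal, project(state)))
    norm = math.dist(goal, project(state))
    if norm <= max_move:
        return delta
    return tuple(d * max_move / norm for d in delta)


def step(state: Vec, action: Vec) -> Vec:
    x, y = state[0] + action[0], state[1] + action[1]
    return (x, y, math.dist(action, (0.0, 0.0)))


demo = [(0.5 * i, 0.0, 0.5) for i in range(7)]  # one straight-line demonstration
goals = extract_goals(demo, project, spacing=1.0)
final_state, reached = run_episode(demo[0], goals, policy, step, project,
                                   tol=0.05, max_steps=50)
```

In this toy the goals ignore the `speed` component of the state, which is the kind of situation where the compatibility issue appears: a goal can count as reached in a state from which the next goal is hard to achieve.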

Code & Models

Repositories

AlexandreChenu/DCIL_XPAG (jax, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Locomotion and Control