DisCoRL: Continual Reinforcement Learning via Policy Distillation

René Traoré; Hugo Caselles-Dupré; Timothée Lesort; Te Sun; Guanghang Cai; Natalia Díaz-Rodríguez; David Filliat

arXiv:1907.05855·cs.LG·July 15, 2019·35 cites

TL;DR

DisCoRL introduces a method combining state representation learning and policy distillation to enable continual reinforcement learning, allowing agents to learn multiple tasks sequentially without forgetting and to infer the current task automatically.

Contribution

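The paper presents DisCoRL, an approach that integrates state representation learning with policy distillation to address three challenges of continual RL: learning several policies with a single model, learning tasks sequentially without forgetting earlier ones, and inferring the active task at test time without an external signal.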

Findings

01 Learned a sequence of three simulated 2D navigation tasks without forgetting earlier ones

02 Automatically inferred the active task without an external signal

03 Transferred the final policy from simulation to a real robot

Abstract

In multi-task reinforcement learning there are two main challenges: at training time, the ability to learn different policies with a single model; at test time, inferring which of those policies to apply without an external signal. In the case of continual reinforcement learning a third challenge arises: learning tasks sequentially without forgetting the previous ones. In this paper, we tackle these challenges by proposing DisCoRL, an approach combining state representation learning and policy distillation. We experiment on a sequence of three simulated 2D navigation tasks with a three-wheel omnidirectional robot. Moreover, we tested our approach's robustness by transferring the final policy into a real-life setting. The policy can solve all tasks and automatically infer which one to run.
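The abstract does not spell out the distillation objective, so the following is only a minimal sketch of one common formulation of policy distillation: a student policy is trained to match a teacher's action distribution by minimizing the KL divergence between the two. The function names (`softmax`, `kl_divergence`, `distillation_loss`), the use of raw action logits, and the choice of KL divergence are assumptions made for illustration, not details confirmed by the paper.

```python
import math

def softmax(logits):
    """Convert action logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def distillation_loss(teacher_logits_batch, student_logits_batch):
    """Mean KL(teacher || student) over a batch of observations.

    Hypothetical helper: each batch element is a list of action logits
    produced by the teacher (or student) policy for the same observation.
    """
    losses = []
    for t_logits, s_logits in zip(teacher_logits_batch, student_logits_batch):
        teacher_probs = softmax(t_logits)
        student_probs = softmax(s_logits)
        losses.append(kl_divergence(teacher_probs, student_probs))
    return sum(losses) / len(losses)

# Illustrative data: two observations, three discrete actions each.
teacher = [[2.0, 0.5, -1.0], [0.0, 1.5, 0.2]]
student = [[1.0, 1.0, 0.0], [0.3, 0.9, 0.1]]
print(distillation_loss(teacher, student))
```

In a continual setting, one could imagine collecting teacher outputs from each task-specific policy in turn and minimizing this loss over the pooled data, so that a single student covers all tasks; how DisCoRL actually builds and uses that data is described in the paper itself.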

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications