Multi-task Learning and Catastrophic Forgetting in Continual Reinforcement Learning

João Ribeiro; Francisco S. Melo; João Dias

arXiv:1909.10008·cs.LG·September 24, 2019

TL;DR

This study examines multi-task deep reinforcement learning. It shows that a multi-task GA3C agent can outperform single-task agents when learning a new, similar task, and that elastic weight consolidation (EWC) helps mitigate catastrophic forgetting on the earlier tasks.

Contribution

The paper shows that multi-task deep RL can outperform single-task models on a new, previously unseen task, and that EWC reduces forgetting of earlier tasks in continual learning scenarios.

Findings

01 Multi-task GA3C outperforms single-task models on a new task.

02 EWC helps retain performance on previous tasks while learning new ones.

03 EWC mitigates catastrophic forgetting in multi-task reinforcement learning.
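
For readers unfamiliar with EWC, the idea behind the findings above can be sketched in a few lines. EWC adds a quadratic penalty to the new task's loss that pulls each parameter toward its value after the earlier tasks, weighted by a diagonal Fisher information estimate of how important that parameter was. The sketch below is plain Python, not the authors' implementation; the function names, the flat parameter lists, and the value of `lam` are assumptions made for illustration only.

```python
from typing import Sequence


def ewc_penalty(
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """Return (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2.

    params        -- current parameters (theta), flattened
    anchor_params -- parameters saved after the earlier tasks (theta_star)
    fisher_diag   -- diagonal Fisher information estimate (F), one per parameter
    lam           -- penalty strength; a hypothetical hyperparameter here
    """
    if not (len(params) == len(anchor_params) == len(fisher_diag)):
        raise ValueError("params, anchor_params and fisher_diag must match in length")
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def ewc_total_loss(
    new_task_loss: float,
    params: Sequence[float],
    anchor_params: Sequence[float],
    fisher_diag: Sequence[float],
    lam: float,
) -> float:
    """New-task loss plus the EWC penalty that protects earlier tasks."""
    return new_task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)


if __name__ == "__main__":
    # Illustrative numbers only: the first parameter moved by 1.0 and has
    # Fisher weight 4.0; the second did not move, so it contributes nothing.
    penalty = ewc_penalty([1.0, 2.0], [0.0, 2.0], [4.0, 10.0], lam=0.5)
    print(penalty)  # 1.0
```

Parameters with a large Fisher weight are expensive to move, so the agent is steered toward solving the new task with parameters that mattered less for the earlier ones. In a deep RL setting the same penalty would be applied to the network weights and differentiated by the training framework.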

Abstract

In this paper we investigate two hypotheses regarding the use of deep reinforcement learning in multiple tasks. The first hypothesis is driven by the question of whether a deep reinforcement learning algorithm, trained on two similar tasks, is able to outperform two single-task, individually trained algorithms, by more efficiently learning a new, similar task that none of the three algorithms has encountered before. The second hypothesis is driven by the question of whether the same multi-task deep RL algorithm, trained on two similar tasks and augmented with elastic weight consolidation (EWC), is able to retain performance on the new task similar to that of the same algorithm without EWC, whilst being able to overcome catastrophic forgetting in the two previous tasks. We show that a multi-task Asynchronous Advantage Actor-Critic (GA3C) algorithm, trained on Space Invaders and Demon Attack,…

Code & Models

Repositories

jmribeiro/UGP (TensorFlow)

Taxonomy

Methods: Elastic Weight Consolidation