Experience Replay for Continual Learning

David Rolnick; Arun Ahuja; Jonathan Schwarz; Timothy P. Lillicrap; Greg Wayne

arXiv:1811.11682 · cs.LG · November 27, 2019 · 376 cites

TL;DR

This paper investigates the use of experience replay buffers to mitigate catastrophic forgetting in continual reinforcement learning without explicit task boundary signals, demonstrating effectiveness in Atari and DMLab environments.

Contribution

It introduces a simple, general approach using experience replay buffers for continual learning that performs well without requiring task labels or boundaries.

Findings

1. Replay buffers substantially reduce catastrophic forgetting.

2. A limited-size buffer that discards data at random performs nearly as well as an unlimited buffer.

3. The method matches the performance of task-aware approaches in reinforcement learning environments.
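
The second finding, a limited-size buffer that discards data at random, can be illustrated with a short sketch. This summary does not say which discarding rule the paper uses, so the code below assumes reservoir sampling, a standard way to keep a uniform random sample of a stream in fixed memory. The class name `ReservoirBuffer`, its methods, and the toy `(task_id, step)` transitions are hypothetical and are not taken from the paper.

```python
import random


class ReservoirBuffer:
    """Fixed-capacity buffer holding a uniform random sample of all items seen.

    Assumption: reservoir sampling stands in for "random data discarding";
    the summarized page does not name the exact rule.
    """

    def __init__(self, capacity, seed=None):
        self.capacity = capacity
        self.items = []
        self.num_seen = 0
        self.rng = random.Random(seed)

    def add(self, transition):
        """Offer one transition; it may be stored, or discarded at random."""
        self.num_seen += 1
        if len(self.items) < self.capacity:
            self.items.append(transition)
            return
        # Keep the new item with probability capacity / num_seen by
        # drawing a slot in [0, num_seen) and overwriting only if the
        # slot falls inside the buffer.
        slot = self.rng.randrange(self.num_seen)
        if slot < self.capacity:
            self.items[slot] = transition

    def sample(self, batch_size):
        """Draw a replay batch without replacement."""
        return self.rng.sample(self.items, min(batch_size, len(self.items)))


# Toy stream: three tasks presented one after another. The task id is
# stored only so the buffer contents can be inspected afterwards; the
# buffer never uses it, so no task-boundary signal is required.
buffer = ReservoirBuffer(capacity=100, seed=0)
for task_id in range(3):
    for step in range(1000):
        buffer.add((task_id, step))

replay_batch = buffer.sample(32)
tasks_in_buffer = {task for task, _ in buffer.items}
```

Because every item in the stream has the same chance of remaining in the buffer, early tasks stay represented after later ones arrive, which is the property a replay-based defense against forgetting relies on.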

Abstract

Continual learning is the problem of learning new tasks or knowledge while protecting old knowledge and ideally generalizing from old experience to learn new tasks faster. Neural networks trained by stochastic gradient descent often degrade on old tasks when trained successively on new tasks with different data distributions. This phenomenon, referred to as catastrophic forgetting, is considered a major hurdle to learning with non-stationary data or sequences of new tasks, and prevents networks from continually accumulating knowledge and skills. We examine this issue in the context of reinforcement learning, in a setting where an agent is exposed to tasks in a sequence. Unlike most other work, we do not provide an explicit indication to the model of task boundaries, which is the most general circumstance for a learning agent exposed to continuous experience. While various methods to…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Bandit Algorithms Research · Online Learning and Analytics

Methods: Experience Replay