Overcoming catastrophic forgetting in neural networks
James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell

TL;DR
This paper presents a scalable method to prevent catastrophic forgetting in neural networks by selectively slowing down learning on important weights, enabling sequential learning of multiple tasks like MNIST classification and Atari games.
Contribution
The authors introduce an approach that maintains performance on old tasks while learning new ones by slowing down learning on the weights that are most important for those old tasks.
Findings
Effective in preventing forgetting on MNIST and Atari tasks
Scalable approach suitable for complex sequential learning
Maintains high accuracy on previously learned tasks
Abstract
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence. Neural networks are not, in general, capable of this and it has been widely thought that catastrophic forgetting is an inevitable feature of connectionist models. We show that it is possible to overcome this limitation and train networks that can maintain expertise on tasks which they have not experienced for a long time. Our approach remembers old tasks by selectively slowing down learning on the weights important for those tasks. We demonstrate our approach is scalable and effective by solving a set of classification tasks based on the MNIST handwritten digit dataset and by learning several Atari 2600 games sequentially.
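The idea of "selectively slowing down learning on the weights important for those tasks" can be illustrated with a small sketch. Elastic Weight Consolidation is usually written as a quadratic penalty that pulls each weight back toward its value after the old task, scaled by a per-weight importance (the diagonal of the Fisher information). Everything concrete below is an assumption made for illustration: the two-weight model, the hand-picked importance values, the toy quadratic loss standing in for the new task, and the function names. It is not the authors' implementation.

```python
def ewc_penalty(theta, theta_star, fisher, lam):
    """Quadratic penalty (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2."""
    return 0.5 * lam * sum(
        f * (t - s) ** 2 for t, s, f in zip(theta, theta_star, fisher)
    )


def ewc_penalty_grad(theta, theta_star, fisher, lam):
    """Gradient of the penalty with respect to each weight."""
    return [lam * f * (t - s) for t, s, f in zip(theta, theta_star, fisher)]


def train_task_b(theta_star, fisher, target_b, lam, lr=0.05, steps=500):
    """Gradient descent on a toy task-B loss plus the penalty.

    The task-B loss is assumed to be 0.5 * ||theta - target_b||^2,
    a stand-in for a real network loss.
    """
    theta = list(theta_star)  # start from the weights learned on task A
    for _ in range(steps):
        grad_b = [t - b for t, b in zip(theta, target_b)]
        grad_pen = ewc_penalty_grad(theta, theta_star, fisher, lam)
        theta = [t - lr * (gb + gp) for t, gb, gp in zip(theta, grad_b, grad_pen)]
    return theta


# Hypothetical values: weight 0 matters a lot for task A, weight 1 not at all.
theta_star = [1.0, 1.0]   # weights after task A
fisher = [10.0, 0.0]      # assumed per-weight importance
target_b = [0.0, 0.0]     # optimum of the toy task-B loss

theta = train_task_b(theta_star, fisher, target_b, lam=1.0)
print(theta)
```

In this toy setting the important weight stays close to its task-A value while the unimportant weight moves freely to the task-B optimum, which is the behaviour the abstract describes. In a real network the importance values would be estimated from data on the old task rather than chosen by hand.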
Code & Models
- 🤗 infgrad/stella-base-zh · model · 22 dl · ♡ 14
- 🤗 infgrad/stella-large-zh · model · 30 dl · ♡ 26
- 🤗 infgrad/stella-large-zh-v2 · model · 4.6k dl · ♡ 32
- 🤗 infgrad/stella-base-zh-v2 · model · 430 dl · ♡ 16
- 🤗 infgrad/stella-base-en-v2 · model · 15k dl · ♡ 16
- 🤗 Research2NLP/electrical_stella · model · 14 dl
- 🤗 jncraton/stella-base-en-v2-ct2-int8 · model · 33 dl
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Human Pose and Action Recognition
Methods: Elastic Weight Consolidation
