Overcoming catastrophic forgetting with hard attention to the task
Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou

TL;DR
This paper introduces a task-based hard attention mechanism for sequential learning in neural networks. It cuts catastrophic forgetting rates by 45 to 80%, is robust to different hyperparameter choices, and offers control over the stability and compactness of the learned knowledge.
Contribution
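The paper proposes a task-based hard attention mechanism that preserves information from previous tasks without affecting learning of the current task. A hard attention mask is learned alongside each task through stochastic gradient descent, and the masks of previous tasks condition that learning.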
Findings
Cuts catastrophic forgetting rates by 45 to 80%.
Robust to different hyperparameter choices.
Offers monitoring capabilities and control over the stability and compactness of the learned knowledge.
Abstract
Catastrophic forgetting occurs when a neural network loses the information learned in a previous task after training on subsequent tasks. This problem remains a hurdle for artificial intelligence systems with sequential learning capabilities. In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning. A hard attention mask is learned concurrently to every task, through stochastic gradient descent, and previous masks are exploited to condition such learning. We show that the proposed mechanism is effective for reducing catastrophic forgetting, cutting current rates by 45 to 80%. We also show that it is robust to different hyperparameter choices, and that it offers a number of monitoring capabilities. The approach features the possibility to control both the stability and compactness of the…
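The abstract describes the masking idea only at a high level, so the following is a minimal sketch under stated assumptions rather than the authors' implementation. It assumes each mask is a sigmoid gate over a per-task embedding scaled by a positive factor `s`, that masks are accumulated across tasks with an element-wise maximum, and that the gradient of a weight is scaled by one minus the minimum of the cumulative masks of the two units it connects. All function names are hypothetical.

```python
import math

def hard_attention(embedding, s):
    """Assumed gate: sigmoid(s * e) per unit; a large s pushes values toward 0 or 1."""
    return [1.0 / (1.0 + math.exp(-s * e)) for e in embedding]

def accumulate(prev_mask, new_mask):
    """Element-wise max: a unit stays protected once any earlier task has used it."""
    return [max(p, n) for p, n in zip(prev_mask, new_mask)]

def condition_gradient(grad, cum_out, cum_in):
    """Scale grad[i][j] by 1 - min(cum_out[i], cum_in[j]).

    A weight is frozen only when both the unit it feeds (i) and the unit
    it reads from (j) were strongly attended by previous tasks.
    """
    return [
        [(1.0 - min(cum_out[i], cum_in[j])) * g for j, g in enumerate(row)]
        for i, row in enumerate(grad)
    ]

# Illustrative values only: task 1 attends to unit 0 and ignores unit 1.
mask_task1 = hard_attention([3.0, -3.0], s=50.0)
cum_out = accumulate([0.0, 0.0], mask_task1)
cum_in = [1.0, 0.0]  # assumed cumulative mask of the previous layer

# Raw gradient for task 2 on a 2x2 weight matrix.
grad = [[1.0, 1.0], [1.0, 1.0]]
conditioned = condition_gradient(grad, cum_out, cum_in)
```

In this toy setup only the weight connecting the two previously attended units has its gradient suppressed. The remaining weights stay free to learn the new task, which is one way to read "preserves previous tasks' information without affecting the current task's learning".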
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications
