Learning to Learn without Forgetting using Attention
Anna Vettoruzzo, Joaquin Vanschoren, Mohamed-Rafik Bouguelia, Thorsteinn Rögnvaldsson

TL;DR
This paper introduces a meta-learned, attention-based transformer optimizer designed to improve continual learning by selectively updating model parameters, thereby reducing forgetting and enhancing knowledge transfer across tasks.
Contribution
It presents a novel transformer-based meta-learning approach that learns effective weight updates to prevent catastrophic forgetting in continual learning scenarios.
Findings
Effective on benchmark datasets like SplitMNIST and SplitCIFAR-100
Improves both forward and backward transfer in continual learning
Performs well even with small labeled datasets
Abstract
Continual learning (CL) refers to the ability to continually learn over time by accommodating new knowledge while retaining previously learned experience. While this concept is inherent in human learning, current machine learning methods are highly prone to overwrite previously learned patterns and thus forget past experience. Instead, model parameters should be updated selectively and carefully, avoiding unnecessary forgetting while optimally leveraging previously learned patterns to accelerate future learning. Since hand-crafting effective update mechanisms is difficult, we propose meta-learning a transformer-based optimizer to enhance CL. This meta-learned optimizer uses attention to learn the complex relationships between model parameters across a stream of tasks, and is designed to generate effective weight updates for the current task while preventing catastrophic forgetting on…
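The idea of an optimizer that uses attention over model parameters to propose weight updates can be illustrated with a small sketch. This is not the authors' architecture: the class name `AttentionOptimizerSketch`, the choice of `(weight, gradient)` pairs as per-parameter input features, the single attention head, and the random (not meta-learned) projection matrices are all assumptions made for illustration. In the paper's setting the optimizer's own parameters would be meta-learned across a stream of tasks; here they are simply sampled at random to show the data flow.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rng, rows, cols, scale=0.1):
    """Small Gaussian matrix; stands in for meta-learned optimizer parameters."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class AttentionOptimizerSketch:
    """Hypothetical single-head attention optimizer (illustrative only)."""

    def __init__(self, d_model=8, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.w_in = random_matrix(rng, 2, d_model)      # embeds (weight, grad)
        self.w_q = random_matrix(rng, d_model, d_model)
        self.w_k = random_matrix(rng, d_model, d_model)
        self.w_v = random_matrix(rng, d_model, d_model)
        self.w_out = random_matrix(rng, d_model, 1)     # maps to a scalar update

    def embed(self, weights, grads):
        """One token per model parameter, built from its weight and gradient."""
        return matmul([[w, g] for w, g in zip(weights, grads)], self.w_in)

    def attention(self, tokens):
        """Scaled dot-product attention weights between parameter tokens."""
        q = matmul(tokens, self.w_q)
        k = matmul(tokens, self.w_k)
        k_t = [list(col) for col in zip(*k)]
        scale = math.sqrt(self.d_model)
        return [softmax([s / scale for s in row]) for row in matmul(q, k_t)]

    def propose_update(self, weights, grads):
        """Return one additive update per model parameter."""
        tokens = self.embed(weights, grads)
        mixed = matmul(self.attention(tokens), matmul(tokens, self.w_v))
        return [row[0] for row in matmul(mixed, self.w_out)]


# Toy model with four parameters and their gradients on the current task.
weights = [0.5, -1.2, 0.3, 2.0]
grads = [0.1, -0.4, 0.0, 0.7]

opt = AttentionOptimizerSketch()
delta = opt.propose_update(weights, grads)
new_weights = [w + d for w, d in zip(weights, delta)]
```

Because each parameter's update is computed from an attention-weighted mix of all parameter tokens, the update for one weight can depend on the state of the others, which is the property the abstract attributes to the attention mechanism. How the optimizer is trained to avoid forgetting is not covered by this sketch.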
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
Methods: Softmax · Attention Is All You Need
