Continual HyperTransformer: A Meta-Learner for Continual Few-Shot Learning
Max Vladymyrov, Andrey Zhmoginov, Mark Sandler

TL;DR
This paper introduces the Continual HyperTransformer, a meta-learning approach that learns a sequence of few-shot tasks without forgetting. It re-uses the generated weights as a memory of past tasks rather than relying on replay buffers or weight regularization.
Contribution
It proposes a novel recursive weight re-use mechanism within a HyperTransformer framework for continual few-shot learning, enabling knowledge retention without traditional replay methods.
Findings
Effective in learning from mini-batches
Capable of task-incremental learning
Maintains performance on past tasks
Abstract
We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel or already seen classes. We approach this problem using the recently published HyperTransformer (HT), a Transformer-based hypernetwork that generates specialized task-specific CNN weights directly from the support set. In order to learn from a continual sequence of tasks, we propose to recursively re-use the generated weights as input to the HT for the next task. This way, the generated CNN weights themselves act as a representation of previously learned tasks, and the HT is trained to update these weights so that the new task can be learned without forgetting past tasks. This approach is different from most continual learning algorithms that typically rely on using replay buffers, weight regularization or task-dependent…
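The recursion described in the abstract, where the weights generated for one task are fed back in when the next task arrives, can be illustrated with a small sketch. This is not the authors' implementation: the real HT is a trained Transformer that emits CNN weights, whereas the hypothetical `toy_weight_generator` below is a fixed, hand-written rule that stores one mean feature vector per class. The names `Weights`, `SupportSet`, `toy_weight_generator`, `predict`, and the toy data are all assumptions made for illustration. Only the outer loop mirrors the idea in the text.

```python
from typing import Dict, List, Optional, Tuple

# Assumption: "weights" are modelled as one prototype vector per class label.
# In the paper the generated weights are CNN weights; this is only a stand-in.
Weights = Dict[int, List[float]]
SupportSet = List[Tuple[List[float], int]]  # (feature vector, class label)


def toy_weight_generator(prev: Optional[Weights], support: SupportSet) -> Weights:
    """Hypothetical stand-in for the HT: (previous weights, support set) -> new weights."""
    # Start from the previous weights so that earlier tasks are carried forward.
    new_weights: Weights = dict(prev) if prev else {}

    # Group the support samples by class label.
    grouped: Dict[int, List[List[float]]] = {}
    for features, label in support:
        grouped.setdefault(label, []).append(features)

    # Store the mean feature vector of each class in the current episode.
    # Simplification: a class seen before has its prototype overwritten.
    for label, samples in grouped.items():
        dim = len(samples[0])
        new_weights[label] = [
            sum(s[i] for s in samples) / len(samples) for i in range(dim)
        ]
    return new_weights


def predict(weights: Weights, features: List[float]) -> int:
    """Return the label of the nearest prototype (squared Euclidean distance)."""
    return min(
        weights,
        key=lambda label: sum((w - f) ** 2 for w, f in zip(weights[label], features)),
    )


# Two few-shot episodes arriving one after the other (made-up data).
tasks: List[SupportSet] = [
    [([0.0, 0.0], 0), ([0.2, 0.0], 0), ([1.0, 1.0], 1), ([1.0, 0.8], 1)],
    [([3.0, 0.0], 2), ([3.0, 0.2], 2), ([0.0, 3.0], 3), ([0.2, 3.0], 3)],
]

# The recursion: weights from task t-1 are an input when processing task t.
weights: Optional[Weights] = None
for support in tasks:
    weights = toy_weight_generator(weights, support)
```

After the loop, `weights` holds entries for the classes of both episodes, so `predict` can still separate first-episode classes even though no first-episode samples were stored. In the paper that retention has to be learned by training the HT; here it is trivially built into the hand-written rule.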
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: HyperNetwork
