Continual HyperTransformer: A Meta-Learner for Continual Few-Shot Learning

Max Vladymyrov; Andrey Zhmoginov; Mark Sandler

arXiv:2301.04584 · cs.LG · August 20, 2024 · 5 cites

Open Access

TL;DR

This paper introduces Continual HyperTransformer, a meta-learning approach that sequentially learns multiple tasks without forgetting by reusing generated weights as a memory of past tasks, avoiding replay buffers or regularization.

Contribution

It proposes a novel recursive weight re-use mechanism within a HyperTransformer framework for continual few-shot learning, enabling knowledge retention without traditional replay methods.

Findings

01. Effective in learning from mini-batches

02. Capable of task-incremental learning

03. Maintains performance on past tasks

Abstract

We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel or already seen classes. We approach this problem using the recently published HyperTransformer (HT), a Transformer-based hypernetwork that generates specialized task-specific CNN weights directly from the support set. In order to learn from a continual sequence of tasks, we propose to recursively re-use the generated weights as input to the HT for the next task. This way, the generated CNN weights themselves act as a representation of previously learned tasks, and the HT is trained to update these weights so that the new task can be learned without forgetting past tasks. This approach is different from most continual learning algorithms that typically rely on using replay buffers, weight regularization or task-dependent…
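The recursive weight re-use described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_hypernetwork`, `predict`, and `learn_sequence` are hypothetical names, and the trained Transformer is replaced by a simple stand-in that stores one mean feature vector per class. Only the control flow is meant to mirror the abstract: the weights generated for one task are passed back in, together with the next support set, to produce the next weights.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Example = Tuple[List[float], int]   # (feature vector, class label)
Weights = Dict[int, List[float]]    # toy "generated weights": one row per class


def toy_hypernetwork(prev: Optional[Weights], support: Sequence[Example]) -> Weights:
    """Hypothetical stand-in for the HT: (previous weights, support set) -> new weights.

    Assumption: the real HT is a trained Transformer that generates CNN weights.
    Here each class row is just the mean of that class's support features.
    Rows for classes absent from the current support set are carried over
    unchanged; rows for classes seen again are overwritten.
    """
    new: Weights = {c: row[:] for c, row in (prev or {}).items()}
    by_class: Dict[int, List[List[float]]] = {}
    for x, y in support:
        by_class.setdefault(y, []).append(x)
    for c, xs in by_class.items():
        new[c] = [sum(col) / len(xs) for col in zip(*xs)]
    return new


def predict(weights: Weights, x: List[float]) -> int:
    """Classify x by the nearest class row (squared Euclidean distance)."""
    return min(
        weights,
        key=lambda c: sum((a - b) ** 2 for a, b in zip(weights[c], x)),
    )


def learn_sequence(tasks: Sequence[Sequence[Example]]) -> Weights:
    """Process tasks in order, feeding the previous weights back in each time."""
    weights: Optional[Weights] = None
    for support in tasks:
        weights = toy_hypernetwork(weights, support)
    return weights or {}


# Two few-shot tasks arriving one after the other (made-up toy data).
task_a = [([0.0, 0.0], 0), ([0.2, 0.0], 0), ([1.0, 1.0], 1)]
task_b = [([5.0, 5.0], 2), ([5.0, 4.0], 2)]
final = learn_sequence([task_a, task_b])
```

After both tasks, `final` holds rows for the classes of the first task as well as the second, and no support examples are stored between tasks. In this toy, retention is trivial because old rows are simply copied; in the paper, the HT has to be trained to update the weights without forgetting.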

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning

Methods: HyperNetwork