Prototype Augmented Hypernetworks for Continual Learning

Neil De La Fuente; Maria Pilligua; Daniel Vidal; Albin Soutiff; Cecilia Curreli; Daniel Cremers; Andrey Barsky

arXiv:2505.07450·cs.LG·May 19, 2025


TL;DR

This paper introduces Prototype-Augmented Hypernetworks (PAH), a continual learning framework that generates task-specific classifier heads from learnable prototypes, reducing catastrophic forgetting and reporting state-of-the-art accuracy on Split-CIFAR100 and TinyImageNet without storing samples or heads.

Contribution

The paper presents PAH, a new hypernetwork-based approach that uses learnable prototypes to generate classifiers and employs dual distillation losses to prevent forgetting.
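The idea of generating a classifier from a prototype can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the prototype is treated as a flat vector, the hypernetwork is a two-layer MLP, and the names `generate_head` and `classify` and all dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
FEAT_DIM = 16    # size of the backbone feature vector
PROTO_DIM = 8    # size of a (flattened) task prototype
N_CLASSES = 5    # classes per task
HIDDEN = 32      # hidden width of the hypernetwork

# Hypernetwork parameters: a single set shared by every task.
W1 = rng.normal(0.0, 0.1, (PROTO_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
OUT_DIM = N_CLASSES * FEAT_DIM + N_CLASSES  # head weights plus head bias
W2 = rng.normal(0.0, 0.1, (HIDDEN, OUT_DIM))
b2 = np.zeros(OUT_DIM)


def generate_head(prototype):
    """Map one task prototype to the weight and bias of a linear head."""
    hidden = np.tanh(prototype @ W1 + b1)
    flat = hidden @ W2 + b2
    weight = flat[: N_CLASSES * FEAT_DIM].reshape(N_CLASSES, FEAT_DIM)
    bias = flat[N_CLASSES * FEAT_DIM:]
    return weight, bias


def classify(features, prototype):
    """Compute logits with the head generated for the given prototype."""
    weight, bias = generate_head(prototype)
    return features @ weight.T + bias


# One prototype per task; in training these would be learnable parameters.
prototypes = {task_id: rng.normal(size=PROTO_DIM) for task_id in range(3)}
features = rng.normal(size=(4, FEAT_DIM))  # stand-in for backbone outputs

logits_task0 = classify(features, prototypes[0])
logits_task1 = classify(features, prototypes[1])
```

Because the head is regenerated from the prototype on each call, no per-task head needs to be stored; only the shared hypernetwork and the prototypes are kept in this sketch.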

Findings

01

PAH achieves 74.5% accuracy on Split-CIFAR100.

02

PAH reaches 63.7% accuracy on TinyImageNet.

03

PAH limits forgetting to 1.7% on Split-CIFAR100 and 4.4% on TinyImageNet.

Abstract

Continual learning (CL) aims to learn a sequence of tasks without forgetting prior knowledge, but gradient updates for a new task often overwrite the weights learned earlier, causing catastrophic forgetting (CF). We propose Prototype-Augmented Hypernetworks (PAH), a framework where a single hypernetwork, conditioned on learnable task prototypes, dynamically generates task-specific classifier heads on demand. To mitigate forgetting, PAH combines cross-entropy with dual distillation losses, one to align logits and another to align prototypes, ensuring stable feature representations across tasks. Evaluations on Split-CIFAR100 and TinyImageNet demonstrate that PAH achieves state-of-the-art performance, reaching 74.5% and 63.7% accuracy with only 1.7% and 4.4% forgetting, respectively, surpassing prior methods without storing samples or heads.
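The combination of cross-entropy with two distillation terms described above could look roughly like the following NumPy sketch. The abstract does not specify the form of either distillation loss, so the choices here are assumptions: a temperature-softened KL divergence for the logit term, a mean squared error for the prototype term, and the weights `lambda_logit` and `lambda_proto`. All function names are hypothetical.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable log-softmax over the last axis of a 2-D array."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def cross_entropy(logits, labels):
    """Mean cross-entropy on the current task's labels."""
    log_probs = log_softmax(logits)
    return -log_probs[np.arange(len(labels)), labels].mean()


def logit_distillation(new_logits, old_logits, temperature=2.0):
    """Assumed form: KL(old || new) on temperature-softened distributions."""
    log_p_old = log_softmax(old_logits / temperature)
    log_p_new = log_softmax(new_logits / temperature)
    return (np.exp(log_p_old) * (log_p_old - log_p_new)).sum(axis=1).mean()


def prototype_distillation(new_protos, old_protos):
    """Assumed form: mean squared drift between current and frozen prototypes."""
    return np.mean((new_protos - old_protos) ** 2)


def total_loss(logits, labels, old_logits, new_protos, old_protos,
               lambda_logit=1.0, lambda_proto=1.0):
    """Cross-entropy plus the two weighted distillation terms."""
    return (cross_entropy(logits, labels)
            + lambda_logit * logit_distillation(logits, old_logits)
            + lambda_proto * prototype_distillation(new_protos, old_protos))


# Toy inputs: the "old" values stand for outputs of a frozen earlier model.
logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 0.3]])
labels = np.array([0, 2])
old_logits = np.array([[1.5, 0.7, -0.8], [0.0, 0.4, 0.2]])
protos = np.ones((3, 4))
old_protos = np.zeros((3, 4))

loss = total_loss(logits, labels, old_logits, protos, old_protos)
```

In this sketch both distillation terms vanish when the current model reproduces the earlier logits and prototypes exactly, so the total reduces to plain cross-entropy on the new task.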

