An Attention-based Representation Distillation Baseline for Multi-Label Continual Learning
Martin Menabue, Emanuele Frascaroli, Matteo Boschini, Lorenzo Bonicelli, Angelo Porrello, Simone Calderara

TL;DR
This paper introduces SCAD (Selective Class Attention Distillation), an attention-based knowledge distillation method for multi-label continual learning that mitigates catastrophic forgetting and outperforms existing continual learning methods on multi-label datasets.
Contribution
It proposes a selective class attention distillation approach that aligns student and teacher representations, specifically addressing multi-label continual learning challenges.
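This page does not give SCAD's actual formulation, so the following is only a generic sketch of what selective, attention-weighted alignment between student and teacher features could look like. It is not the authors' implementation. The names `selective_distillation_loss` and `relevance_scores` are hypothetical, and the use of a softmax over per-feature relevance scores is an assumption made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def selective_distillation_loss(student, teacher, relevance_scores):
    """Attention-weighted squared error between student and teacher features.

    Hypothetical illustration: features with a higher relevance score
    contribute more to the alignment penalty, while features with a very
    low score are effectively excluded from the transfer.
    """
    if not (len(student) == len(teacher) == len(relevance_scores)):
        raise ValueError("all inputs must have the same length")
    weights = softmax(relevance_scores)
    return sum(w * (s - t) ** 2 for w, s, t in zip(weights, student, teacher))


# The student disagrees with the frozen teacher on the second feature only.
student = [1.0, 2.0, 3.0]
teacher = [1.0, 0.0, 3.0]

uniform_loss = selective_distillation_loss(student, teacher, [0.0, 0.0, 0.0])
masked_loss = selective_distillation_loss(student, teacher, [0.0, -50.0, 0.0])
```

With uniform scores the loss reduces to a plain mean squared error. Pushing the score of the disagreeing feature far down removes it from the penalty, which is the sense in which the transfer is "selective" in this sketch.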
Findings
SCAD outperforms current state-of-the-art CL methods on multi-label datasets.
Existing CL methods struggle with multi-label scenarios, highlighting the need for specialized approaches.
Selective transfer of relevant information improves continual learning performance.
Abstract
The field of Continual Learning (CL) has inspired numerous researchers over the years, leading to increasingly advanced countermeasures to the issue of catastrophic forgetting. Most studies have focused on the single-class scenario, where each example comes with a single label. The recent literature has successfully tackled such a setting, with impressive results. Differently, we shift our attention to the multi-label scenario, as we feel it to be more representative of real-world open problems. In our work, we show that existing state-of-the-art CL methods fail to achieve satisfactory performance, thus questioning the real advance claimed in recent years. Therefore, we assess both old-style and novel strategies and propose, on top of them, an approach called Selective Class Attention Distillation (SCAD). It relies on a knowledge transfer technique that seeks to align the…
Taxonomy
Topics: Video Analysis and Summarization · Machine Learning and Data Classification · Text and Document Classification Technologies
Methods: Softmax · Attention Is All You Need · Class Attention · ALIGN
