Exemplar-free Continual Learning of Vision Transformers via Gated Class-Attention and Cascaded Feature Drift Compensation
Marco Cotogni, Fei Yang, Claudio Cusano, Andrew D. Bagdanov, Joost van de Weijer

TL;DR
This paper introduces a novel exemplar-free continual learning method for Vision Transformers that uses gated class-attention and feature drift compensation to balance learning new tasks and retaining previous knowledge without needing past samples.
Contribution
It proposes gated class-attention and cascaded feature drift compensation techniques for exemplar-free ViT continual learning, avoiding the need for rehearsal samples.
Findings
Achieves competitive performance on CIFAR-100, Tiny-ImageNet, and ImageNet100.
Effectively limits catastrophic forgetting without exemplar replay.
Maintains task-agnostic inference capability.
Abstract
We propose a new method for exemplar-free class incremental training of ViTs. The main challenge of exemplar-free continual learning is maintaining plasticity of the learner without causing catastrophic forgetting of previously learned tasks. This is often achieved via exemplar replay, which can help recalibrate previous task classifiers to the feature drift that occurs when learning new tasks. Exemplar replay, however, comes at the cost of retaining samples from previous tasks, which for many applications may not be possible. To address the problem of continual ViT training, we first propose gated class-attention to minimize the drift in the final ViT transformer block. This mask-based gating is applied to the class-attention mechanism of the last transformer block and strongly regulates the weights crucial for previous tasks. Importantly, gated class-attention does not require the task-ID…
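The mask-based gating described in the abstract can be illustrated with a small sketch. The abstract does not give the exact formulation, so everything below is an assumption made for illustration: the gate is a per-unit sigmoid mask multiplied onto the class-attention output (in the style of hard-attention-to-the-task masking), and weights tied to units used by earlier tasks are protected by scaling their gradients with one minus the cumulative mask. The function names `gated_class_attention` and `protect_gradient` are hypothetical, and the sketch does not address how a gate is chosen at inference without a task-ID.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(vec, W):
    # Row vector (d_in) times matrix (d_in x d_out).
    return [sum(v * W[i][j] for i, v in enumerate(vec)) for j in range(len(W[0]))]

def gated_class_attention(cls_tok, patches, Wq, Wk, Wv, gate_logits):
    """Single-head class-attention with an assumed per-unit sigmoid gate.

    Only the class token forms a query; it attends over itself and all
    patch tokens. The gate (hypothetical formulation) scales each output unit.
    """
    tokens = [cls_tok] + patches
    q = matvec(cls_tok, Wq)
    keys = [matvec(t, Wk) for t in tokens]
    values = [matvec(t, Wv) for t in tokens]
    scale = math.sqrt(len(q))
    attn = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
    out = [sum(w * v[j] for w, v in zip(attn, values)) for j in range(len(q))]
    gate = [sigmoid(g) for g in gate_logits]
    return [g * o for g, o in zip(gate, out)], gate

def protect_gradient(grad, cumulative_gate):
    # Assumed regularisation: units heavily used by previous tasks
    # (cumulative gate near 1) receive almost no gradient.
    return [g * (1.0 - m) for g, m in zip(grad, cumulative_gate)]

# Toy usage with 2-dimensional tokens and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
out, gate = gated_class_attention(
    [1.0, 0.0], [[0.0, 1.0], [1.0, 1.0]],
    identity, identity, identity,
    gate_logits=[-50.0, 50.0],  # unit 0 closed, unit 1 open
)
masked_grad = protect_gradient([1.0, 1.0], gate)
```

In this toy setting the closed unit contributes nothing to the output and remains free to be updated by later tasks, while the open unit passes its value through and is shielded from further gradient updates.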
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques
