FCL-ViT: Task-Aware Attention Tuning for Continual Learning
Anestis Kaimakamidis, Ioannis Pitas

TL;DR
FCL-ViT introduces a feedback mechanism in Vision Transformers for continual learning, dynamically tuning attention features to adapt to new tasks while outperforming existing methods.
Contribution
The paper proposes FCL-ViT, a novel feedback-based Vision Transformer architecture with task-aware attention tuning for improved continual learning performance.
Findings
Surpasses state-of-the-art methods on continual learning benchmarks.
Uses fewer trainable parameters than existing methods.
Effectively adapts attention features to new tasks.
Abstract
Continual Learning (CL) involves adapting prior Deep Neural Network (DNN) knowledge to new tasks without forgetting the old ones. However, modern CL techniques focus on provisioning memory capabilities to existing DNN models rather than designing new ones that are able to adapt according to the task at hand. This paper presents the novel Feedback Continual Learning Vision Transformer (FCL-ViT), which uses a feedback mechanism to generate real-time dynamic attention features tailored to the current task. The FCL-ViT operates in two phases. In phase 1, generic image features are produced and determine where the Transformer should attend in the current image. In phase 2, task-specific image features are generated that leverage dynamic attention. To this end, Tunable self-Attention Blocks (TABs) and Task Specific Blocks (TSBs) are introduced that operate in both phases and are…
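The two-phase flow described in the abstract can be illustrated with a rough sketch. This is not the authors' implementation: the abstract is truncated and does not say how TABs are tuned, so the scale-and-shift modulation, the mean pooling, and every class and function name below (`TunableAttentionBlock`, `TaskSpecificBlock`, `two_phase_forward`) are assumptions made only for illustration. The sketch uses NumPy, a single attention head, and random weights.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class TunableAttentionBlock:
    """Hypothetical stand-in for a TAB: single-head self-attention whose
    output can be modulated by a feedback signal (assumed scale/shift)."""
    def __init__(self, dim, rng):
        s = 1.0 / np.sqrt(dim)
        self.dim = dim
        self.wq = rng.normal(0.0, s, (dim, dim))
        self.wk = rng.normal(0.0, s, (dim, dim))
        self.wv = rng.normal(0.0, s, (dim, dim))

    def __call__(self, x, scale=1.0, shift=0.0):
        q, k, v = x @ self.wq, x @ self.wk, x @ self.wv
        attn = softmax(q @ k.T / np.sqrt(self.dim))
        out = attn @ v
        # With scale=1 and shift=0 this is plain self-attention (phase 1).
        return x + out * scale + shift

class TaskSpecificBlock:
    """Hypothetical stand-in for a TSB: maps pooled generic features
    to modulation parameters for the tunable block."""
    def __init__(self, dim, rng):
        self.w_scale = rng.normal(0.0, 0.1, (dim, dim))
        self.w_shift = rng.normal(0.0, 0.1, (dim, dim))

    def __call__(self, feats):
        pooled = feats.mean(axis=0)  # assumed pooling over tokens
        scale = 1.0 + np.tanh(pooled @ self.w_scale)
        shift = np.tanh(pooled @ self.w_shift)
        return scale, shift

def two_phase_forward(tokens, tab, tsb):
    generic = tab(tokens)                 # phase 1: generic features
    scale, shift = tsb(generic)           # feedback: derive tuning signal
    specific = tab(tokens, scale, shift)  # phase 2: task-specific features
    return generic, specific

rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 32))  # 16 patch tokens, 32-dim embeddings
tab = TunableAttentionBlock(32, rng)
tsb = TaskSpecificBlock(32, rng)
generic, specific = two_phase_forward(tokens, tab, tsb)
print(generic.shape, specific.shape)
```

The point of the sketch is that the same attention weights are reused in both passes and only the modulation changes, which is one plausible reading of how a feedback design could keep the number of trainable parameters low.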
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning
Methods: Attention Is All You Need · Absolute Position Encodings · Adam · Softmax · Label Smoothing · Dropout · Dense Connections · Layer Normalization · Linear Layer · Multi-Head Attention
