BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning
Kishaan Jeeveswaran, Prashant Bhat, Bahram Zonooz, Elahe Arani

TL;DR
BiRT is a continual learning method for vision transformers that augments representation rehearsal with constructive noises and consistency enforcement, improving performance and robustness across benchmarks.
Contribution
The paper proposes BiRT, a new representation rehearsal approach using vision transformers with constructive noises and consistency constraints for continual learning.
Findings
Outperforms raw image and vanilla representation rehearsal methods.
Achieves consistent performance gains on challenging benchmarks.
Robust to natural and adversarial corruptions.
Abstract
The ability of deep neural networks to continually learn and adapt to a sequence of tasks has remained challenging due to catastrophic forgetting of previously learned tasks. Humans, on the other hand, have a remarkable ability to acquire, assimilate, and transfer knowledge across tasks throughout their lifetime without catastrophic forgetting. The versatility of the brain can be attributed to the rehearsal of abstract experiences through a complementary learning system. However, representation rehearsal in vision transformers lacks diversity, which leads to overfitting; consequently, performance drops significantly compared to raw image rehearsal. Therefore, we propose BiRT, a novel representation rehearsal-based continual learning approach using vision transformers. Specifically, we introduce constructive noises at various stages of the vision transformer and enforce consistency in…
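The representation rehearsal idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class name `RepresentationBuffer`, the reservoir-sampling fill policy, and the additive Gaussian noise controlled by `noise_std` are all assumptions chosen for illustration. It only shows the general pattern of storing intermediate representations instead of raw images and perturbing them at replay time to add diversity; consistency enforcement is not shown.

```python
import random


class RepresentationBuffer:
    """Hypothetical fixed-size memory of (representation, label) pairs.

    Assumption: the buffer is filled by reservoir sampling, so every
    representation seen so far has an equal chance of being retained.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []          # stored (representation, label) pairs
        self.seen = 0            # total number of representations offered
        self.rng = random.Random(seed)

    def add(self, representation, label):
        """Offer one intermediate representation to the buffer."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append((list(representation), label))
        else:
            # Reservoir sampling: replace a slot with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = (list(representation), label)

    def sample(self, batch_size, noise_std=0.0):
        """Draw stored representations, perturbed with Gaussian noise.

        Assumption: additive Gaussian noise stands in for the paper's
        "constructive noises"; the actual noise types may differ.
        """
        batch = self.rng.sample(self.items, min(batch_size, len(self.items)))
        return [
            ([x + self.rng.gauss(0.0, noise_std) for x in rep], label)
            for rep, label in batch
        ]
```

In a training loop, representations taken from an intermediate transformer block would be passed to `add` while learning the current task, and `sample` would supply perturbed rehearsal batches for the later blocks when learning subsequent tasks.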
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention · Residual Connection · Dense Connections · Layer Normalization · Vision Transformer
