Exclusive Supermask Subnetwork Training for Continual Learning
Prateek Yadav, Mohit Bansal

TL;DR
This paper introduces ExSSNeT, a novel continual learning approach that trains exclusive, non-overlapping subnetworks with knowledge transfer, outperforming previous methods while preventing forgetting across NLP and vision tasks.
Contribution
ExSSNeT is a new method that trains exclusive subnetworks to improve continual learning performance and incorporates a KNN-based knowledge transfer module.
Findings
ExSSNeT outperforms previous methods on NLP and vision tasks.
ExSSNeT achieves an 8.3% improvement over SupSup with sparse masks.
ExSSNeT scales effectively to 100 tasks.
Abstract
Continual Learning (CL) methods focus on accumulating knowledge over time while avoiding catastrophic forgetting. Recently, Wortsman et al. (2020) proposed a CL method, SupSup, which uses a randomly initialized, fixed base network (model) and finds a supermask for each new task that selectively keeps or removes each weight to produce a subnetwork. They prevent forgetting as the network weights are not being updated. Although there is no forgetting, the performance of SupSup is sub-optimal because fixed weights restrict its representational power. Furthermore, there is no accumulation or transfer of knowledge inside the model when new tasks are learned. Hence, we propose ExSSNeT (Exclusive Supermask SubNEtwork Training), which performs exclusive and non-overlapping subnetwork weight training. This avoids conflicting updates to the shared weights by subsequent tasks to improve performance…
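The exclusive, non-overlapping weight training described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `exclusive_step`, the hand-picked masks, the flattened 8-weight "network", and the constant placeholder gradient are all assumptions made for illustration, and the supermask search itself is not shown. The sketch only demonstrates the idea that a task may update the weights its mask selects, except those already claimed by earlier tasks.

```python
import numpy as np

def exclusive_step(weights, grad, task_mask, used_mask, lr=0.1):
    """One gradient step restricted to weights that this task's mask selects
    and that no earlier task's mask has already claimed (hypothetical helper)."""
    free = task_mask & ~used_mask          # exclusive part of this task's subnetwork
    return weights - lr * grad * free      # all other weights are left unchanged

rng = np.random.default_rng(0)
weights = rng.standard_normal(8)           # randomly initialized base network, flattened
w0 = weights.copy()
used_mask = np.zeros(8, dtype=bool)        # union of the masks of all earlier tasks

# Assumed binary supermasks for two tasks; they overlap at indices 2 and 3.
mask_1 = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
mask_2 = np.array([0, 0, 1, 1, 1, 1, 0, 0], dtype=bool)

grad = np.ones(8)                          # placeholder gradient, not a real loss gradient

# Task 1: every weight in mask_1 is still free, so all of them are trained.
weights = exclusive_step(weights, grad, mask_1, used_mask)
used_mask |= mask_1
w_after_task1 = weights.copy()

# Task 2: only indices 4 and 5 are trained; the overlap (2, 3) stays frozen.
weights = exclusive_step(weights, grad, mask_2, used_mask)
used_mask |= mask_2
```

In this sketch the weights used by task 1 are identical before and after task 2 is trained, which is the property that rules out forgetting, while task 2 can still read the shared weights at indices 2 and 3 through its mask.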
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Balanced Selection
