Exclusive Supermask Subnetwork Training for Continual Learning

Prateek Yadav; Mohit Bansal

arXiv:2210.10209 · cs.CV · July 6, 2023

TL;DR

This paper introduces ExSSNeT, a novel continual learning approach that trains exclusive, non-overlapping subnetworks with knowledge transfer, outperforming previous methods while preventing forgetting across NLP and vision tasks.

Contribution

ExSSNeT is a new method that trains exclusive subnetworks to improve continual learning performance and incorporates a KNN-based knowledge transfer module.

Findings

01

ExSSNeT outperforms previous methods on NLP and vision tasks.

02

ExSSNeT achieves an 8.3% improvement over SupSup when using sparse masks.

03

ExSSNeT scales effectively to 100 tasks.

Abstract

Continual Learning (CL) methods focus on accumulating knowledge over time while avoiding catastrophic forgetting. Recently, Wortsman et al. (2020) proposed a CL method, SupSup, which uses a randomly initialized, fixed base network (model) and finds a supermask for each new task that selectively keeps or removes each weight to produce a subnetwork. This prevents forgetting because the network weights are never updated. Although there is no forgetting, the performance of SupSup is sub-optimal because fixed weights restrict its representational power. Furthermore, there is no accumulation or transfer of knowledge inside the model when new tasks are learned. Hence, we propose ExSSNeT (Exclusive Supermask SubNEtwork Training), which performs exclusive and non-overlapping subnetwork weight training. This avoids conflicting updates to the shared weights by subsequent tasks to improve performance…
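The idea of exclusive, non-overlapping weight training described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the masks here are drawn at random rather than learned, the gradients are placeholders, a single update step stands in for training, and the names `random_mask` and `exclusive_update` are hypothetical. It only shows the bookkeeping under those assumptions: a task may update a weight only if its mask selects that weight and no earlier task has already trained it.

```python
import random


def random_mask(n, k, rng):
    """Hypothetical stand-in for a learned supermask: select k of n weights."""
    chosen = set(rng.sample(range(n), k))
    return [i in chosen for i in range(n)]


def exclusive_update(weights, mask, claimed, grads, lr=0.1):
    """Apply one gradient step to weights that `mask` selects and that no
    earlier task has claimed. Returns (new_weights, new_claimed)."""
    new_weights = list(weights)
    new_claimed = list(claimed)
    for i, selected in enumerate(mask):
        if selected and not claimed[i]:
            new_weights[i] = weights[i] - lr * grads[i]
            new_claimed[i] = True  # later tasks must leave this weight alone
    return new_weights, new_claimed


rng = random.Random(0)
n = 20
weights = [rng.gauss(0.0, 1.0) for _ in range(n)]  # randomly initialized base network
claimed = [False] * n                              # no weight has been trained yet

# Task 1: every weight in its mask is still free, so all of them are trained.
mask1 = random_mask(n, 6, rng)
weights_after_1, claimed_after_1 = exclusive_update(
    weights, mask1, claimed, grads=[1.0] * n  # placeholder gradients (assumption)
)

# Task 2: its mask may overlap with task 1, but overlapping weights stay frozen.
mask2 = random_mask(n, 6, rng)
weights_after_2, claimed_after_2 = exclusive_update(
    weights_after_1, mask2, claimed_after_1, grads=[1.0] * n
)

# The subnetwork used by task 1 is identical before and after task 2.
task1_before = [w for w, m in zip(weights_after_1, mask1) if m]
task1_after = [w for w, m in zip(weights_after_2, mask1) if m]
assert task1_before == task1_after
```

Because task 2 never touches weights that task 1 trained, task 1's subnetwork produces the same outputs after task 2 is learned, which is the sense in which exclusive updates avoid conflicting changes to shared weights.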

Code & Models

Repositories

prateeky2806/exessnet
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Balanced Selection