Differentiable Weight Masks for Domain Transfer
Samar Khanna, Skanda Vaidyanath, Akash Velu

TL;DR
This paper studies three weight-masking methods for neural networks and analyses how well each one retains source-task knowledge while still allowing efficient adaptation to a new target task.
Contribution
The paper combines work on modular weight masking with work on domain transfer, analysing how effectively learned weight masks mitigate forgetting of the source task during fine-tuning on a new target task.
Findings
Different masking methods show trade-offs between source task retention and target task adaptation.
Some masks effectively prevent forgetting but may hinder target task performance.
The study provides insights into selecting masking strategies based on desired transfer outcomes.
Abstract
One of the major drawbacks of deep learning models for computer vision has been their inability to retain multiple sources of information in a modular fashion. For instance, given a network that has been trained on a source task, we would like to re-train this network on a similar, yet different, target task while maintaining its performance on the source task. Simultaneously, researchers have extensively studied modularization of network weights to localize and identify the set of weights culpable for eliciting the observed performance on a given task. One set of works studies the modularization induced in the weights of a neural network by learning and analysing weight masks. In this work, we combine these fields to study three such weight masking methods and analyse their ability to mitigate "forgetting" on the source task while also allowing for efficient finetuning on the target…
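The abstract does not say which three masking methods are compared, so the following is only a generic illustration of what a differentiable weight mask can look like, not the authors' method. It assumes a sigmoid-relaxed mask over a frozen linear model: each source weight w_i gets a learnable score s_i, the effective weight is w_i * sigmoid(s_i), and only the scores are trained on the target task. All names (`train_mask`, `source_weights`, the toy target task) are hypothetical.

```python
import itertools
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def masked_forward(weights, scores, x):
    # Effective weight is w_i * sigmoid(s_i); the source weights are never modified.
    return sum(w * sigmoid(s) * xi for w, s, xi in zip(weights, scores, x))

def mse(weights, scores, data):
    return sum((masked_forward(weights, scores, x) - t) ** 2 for x, t in data) / len(data)

def train_mask(weights, data, steps=500, lr=0.5, init_score=3.0):
    # Gradient descent on the mask scores only (assumed setup: squared-error loss).
    scores = [init_score] * len(weights)
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for x, target in data:
            err = masked_forward(weights, scores, x) - target
            for i, (w, s, xi) in enumerate(zip(weights, scores, x)):
                m = sigmoid(s)
                # d(loss)/d(s_i) = 2 * err * w_i * x_i * m * (1 - m), averaged over data
                grads[i] += 2.0 * err * w * xi * m * (1.0 - m) / len(data)
        scores = [s - lr * g for s, g in zip(scores, grads)]
    return scores

# Hypothetical "source" weights, kept frozen throughout.
source_weights = [1.0, -2.0, 0.5]

# Toy "target" task that needs only the first and third source weights.
inputs = [list(p) for p in itertools.product([-1.0, 1.0], repeat=3)]
target_data = [(x, 1.0 * x[0] + 0.5 * x[2]) for x in inputs]

loss_before = mse(source_weights, [3.0] * 3, target_data)
learned_scores = train_mask(source_weights, target_data)
loss_after = mse(source_weights, learned_scores, target_data)

# Thresholding the scores at zero gives a binary mask over the source weights.
binary_mask = [1 if s > 0 else 0 for s in learned_scores]
```

Because the underlying weights are untouched, dropping the mask recovers the source model exactly; in this toy setup that is the sense in which a mask can adapt to a target task without forgetting the source task.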
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
