HyperMask: Adaptive Hypernetwork-based Masks for Continual Learning
Kamil Książek, Przemysław Spurek

TL;DR
HyperMask introduces a hypernetwork-based method that dynamically generates task-specific sparse subnetworks for continual learning, mitigating catastrophic forgetting and achieving competitive results, with state-of-the-art scores in some scenarios.
Contribution
The paper proposes HyperMask, a novel approach using semi-binary masks and the lottery ticket hypothesis to create adaptive, task-specific subnetworks within a single network for continual learning.
Findings
HyperMask achieves competitive results on multiple CL datasets.
It surpasses state-of-the-art scores in some scenarios.
The method effectively mitigates catastrophic forgetting.
Abstract
Artificial neural networks suffer from catastrophic forgetting when they are sequentially trained on multiple tasks. Many continual learning (CL) strategies try to overcome this problem. One of the most effective is the hypernetwork-based approach. The hypernetwork generates the weights of a target model based on the task's identity. The model's main limitation is that, in practice, the hypernetwork can produce completely different architectures for subsequent tasks. To solve this problem, we use the lottery ticket hypothesis, which postulates the existence of sparse subnetworks, named winning tickets, that preserve the performance of the whole network. In this paper, we propose a method called HyperMask, which dynamically filters a target network depending on the CL task. The hypernetwork produces semi-binary masks to obtain dedicated target subnetworks. Moreover, due to the…
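The mechanism described in the abstract can be sketched in a few lines of Python. This is a toy illustration, not the authors' implementation: the linear hypernetwork, the tanh-then-prune construction of the semi-binary mask, the sparsity value, and all names (`hypernetwork`, `semi_binary_mask`, `masked_forward`, `task_embeddings`) are assumptions made for the example.

```python
import math
import random


def hypernetwork(task_embedding, hyper_weights):
    """Toy linear hypernetwork (assumed): one mask score per target weight."""
    return [sum(w * e for w, e in zip(row, task_embedding)) for row in hyper_weights]


def semi_binary_mask(scores, sparsity):
    """Assumed construction: squash scores with tanh, then set the `sparsity`
    fraction of entries with the smallest magnitude to exactly zero.
    The surviving entries stay real-valued, hence "semi-binary"."""
    squashed = [math.tanh(s) for s in scores]
    n_zero = int(sparsity * len(squashed))
    order = sorted(range(len(squashed)), key=lambda i: abs(squashed[i]))
    pruned = set(order[:n_zero])
    return [0.0 if i in pruned else v for i, v in enumerate(squashed)]


def masked_forward(x, target_weights, mask):
    """A single linear unit whose shared weights are filtered by the task mask."""
    return sum(w * m * xi for w, m, xi in zip(target_weights, mask, x))


random.seed(0)
n_weights, emb_dim = 8, 3

# Hypothetical parameters: in a real setup both would be trained.
hyper_weights = [[random.gauss(0, 1) for _ in range(emb_dim)] for _ in range(n_weights)]
target_weights = [random.gauss(0, 1) for _ in range(n_weights)]

# One embedding per CL task; the task identity selects the mask.
task_embeddings = {0: [1.0, 0.0, 0.5], 1: [-0.5, 1.0, 0.0]}

masks = {
    task: semi_binary_mask(hypernetwork(emb, hyper_weights), sparsity=0.5)
    for task, emb in task_embeddings.items()
}

x = [1.0] * n_weights
outputs = {task: masked_forward(x, target_weights, m) for task, m in masks.items()}
```

Both tasks share the same `target_weights`; only the mask changes, so each task effectively runs on its own sparse subnetwork of a single target network.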
Peer Reviews
Decision · Submitted to ICLR 2024
(1) Hypernetworks are used for continual learning in a different way, by generating hypermasks for each task. (2) The work connects lottery ticket theory to continual learning with a single network. (3) The method is easy to follow in general.
(1) Some works mentioned in the related work use masks as an extension of the whole network, so it is unclear what benefits the hypernetwork brings. (2) Several common loss functions are used in the method, and it is unclear whether the improvements come from the proposed hypernetworks or from the additional regularizations. There is no ablation study on these components. (3) The experimental evaluation is very limited. It only compares with other methods on very tiny benchmarks. The perf
1. The paper introduces an innovative approach, HyperMask, that combines the concepts of hypernetworks and the lottery ticket hypothesis. This unique combination leads to a novel method for addressing catastrophic forgetting in continual learning. 2. The idea of using semi-binary masks generated by a hypernetwork to create target subnetworks is a fresh and creative approach to tackling the challenges of continual learning. 3. The paper demonstrates a high level of quality in the experimental e
1. The primary contribution of HyperMask, which involves using hypernetworks to produce semi-binary masks for continual learning, may not be considered highly novel in the field of continual learning and neural network architectures. Hypernetworks have been explored in prior research as a means to generate task-specific weights for neural networks [1][2], and the concept of using masks or pruning for model adaptation is not entirely new. 2. The paper lacks a deeper theoretical analysis of the pro
- The method itself is technically reasonable and the details of the method are well described in the paper.
- The baselines suggested in this paper are relatively outdated. I understand that incremental architecture methods no longer dominate this field of research, but that does not mean the authors can ignore regularization-based or representation-based methods. For example, FeCAM [1] does not even need to know the task index and it still outperforms the authors' method.
- The amount of experimental results is severely insufficient. According to the paper, HyperMask is superior to other methods only in Split-
Taxonomy
Topics: Innovative Teaching and Learning Methods · Online and Blended Learning · Intelligent Tutoring Systems and Adaptive Learning
Methods: HyperNetwork
