Neural Networks Remember More: The Power of Parameter Isolation and Combination

Biqing Zeng; Zehan Li; Aladdin Ayesh

arXiv:2502.10966·cs.CL·February 18, 2025


Open Access

TL;DR

This paper introduces a novel continual learning method for language models that uses parameter isolation and combination to balance stability and plasticity, significantly reducing catastrophic forgetting.

Contribution

It proposes a new approach combining parameter isolation and task arithmetic to improve knowledge retention in continual language learning.

Findings

01 Outperforms existing state-of-the-art methods on benchmarks

02 Effectively mitigates catastrophic forgetting

03 Enhances model stability without sacrificing plasticity

Abstract

Catastrophic forgetting is a pervasive issue for pre-trained language models (PLMs) during continual learning, where models lose previously acquired knowledge when sequentially trained on a series of tasks. The model's ability to retain old tasks is referred to as stability, while its adaptability to new tasks is called plasticity. Therefore, the key to solving this problem is to find a trade-off between the plasticity and stability of the model. To address this issue, in this paper, we propose a novel method to achieve a balance between model stability and plasticity, thereby mitigating catastrophic forgetting. More specifically, our proposed approach leverages parameter isolation and a subsequent combination strategy. Initially, in the training stage, the model adapts to each downstream task via a parameter isolation method to prevent potential interference among different tasks. We…
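The abstract is cut off before it describes the combination step, so the following is only a minimal sketch of generic task arithmetic, the technique named under Contribution, and not the authors' implementation. It assumes each task is trained separately from the same pre-trained weights (a simple stand-in for parameter isolation), that a task vector is the difference between task-specific and pre-trained parameters, and that the vectors are merged by a scaled sum. The names `task_vector`, `combine`, and `scale`, and the toy parameter values, are hypothetical.

```python
from typing import Dict, List

# Toy stand-in for model weights: parameter name -> flat list of values.
Params = Dict[str, List[float]]


def task_vector(pretrained: Params, finetuned: Params) -> Params:
    """Return the per-parameter difference finetuned - pretrained."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def combine(pretrained: Params, vectors: List[Params], scale: float = 1.0) -> Params:
    """Add the scaled sum of all task vectors back onto the pre-trained weights."""
    merged: Params = {}
    for name, base in pretrained.items():
        total = [0.0] * len(base)
        for vec in vectors:
            total = [t + v for t, v in zip(total, vec[name])]
        merged[name] = [b + scale * t for b, t in zip(base, total)]
    return merged


# Assumption: each task was trained in isolation from the same starting point,
# so the two tasks never overwrite each other's updates during training.
pretrained = {"w": [1.0, 2.0, 3.0]}
task_a = {"w": [1.5, 2.0, 3.0]}  # hypothetical weights after task A
task_b = {"w": [1.0, 2.0, 2.0]}  # hypothetical weights after task B

vectors = [task_vector(pretrained, model) for model in (task_a, task_b)]
merged = combine(pretrained, vectors, scale=1.0)
print(merged)
```

In this toy case the two tasks change different coordinates, so the merged weights keep both updates. When task vectors overlap, the choice of `scale` controls how strongly the combined updates move the model away from the pre-trained weights.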


Taxonomy

Topics: Neural Networks and Applications