Revisiting Weight Regularization for Low-Rank Continual Learning

Yaoyue Zheng; Yin Zhang; Joost van de Weijer; Gido M van de Ven; Shaoyi Du; Xuetao Zhang; Zhiqiang Tian

arXiv:2602.17559 · cs.LG · February 20, 2026


Open Access · 3 Reviews

TL;DR

This paper introduces EWC-LoRA, a novel low-rank regularization method for continual learning with pre-trained models, effectively mitigating task interference while maintaining low storage and inference costs.

Contribution

The paper revisits weight regularization in low-rank continual learning and proposes regularizing a shared low-rank update, which improves the stability-plasticity trade-off.

Findings

1. EWC-LoRA outperforms existing low-rank CL methods in benchmarks.
2. The method maintains constant storage and inference costs regardless of task number.
3. Weight regularization remains effective in low-rank parameterizations for CL.

Abstract

Continual Learning (CL) with large-scale pre-trained models (PTMs) has recently gained wide attention, shifting the focus from training from scratch to continually adapting PTMs. This has given rise to a promising paradigm: parameter-efficient continual learning (PECL), where task interference is typically mitigated by assigning a task-specific module during training, such as low-rank adapters. However, weight regularization techniques such as Elastic Weight Consolidation (EWC), a key strategy in CL, remain underexplored in this new paradigm. In this paper, we revisit weight regularization in low-rank CL as a new perspective for mitigating task interference in PECL. Unlike existing low-rank CL methods, we mitigate task interference by regularizing a shared low-rank update through EWC, thereby keeping the storage requirement and inference costs constant regardless of the number of tasks.…
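The abstract does not state the exact objective, so the following is only a minimal sketch of the general idea, assuming the standard quadratic EWC penalty with a diagonal Fisher estimate applied to a single shared low-rank update `B @ A`. The function names, the full-shape `fisher_diag` array, and the `lam` weight are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def ewc_penalty(B, A, delta_prev, fisher_diag, lam):
    """EWC-style quadratic penalty on a shared low-rank update.

    B: (d_out, r) and A: (r, d_in) are the shared low-rank factors.
    delta_prev: (d_out, d_in) update consolidated after earlier tasks.
    fisher_diag: (d_out, d_in) diagonal Fisher estimate (assumed shape).
    lam: regularization strength (hypothetical hyperparameter).
    """
    diff = B @ A - delta_prev
    return 0.5 * lam * np.sum(fisher_diag * diff ** 2)


def ewc_penalty_grads(B, A, delta_prev, fisher_diag, lam):
    """Gradients of the penalty with respect to B and A."""
    # Gradient with respect to the full update delta_W = B @ A.
    g = lam * fisher_diag * (B @ A - delta_prev)
    # Chain rule through the matrix product.
    return g @ A.T, B.T @ g


def num_trainable_params(d_out, d_in, r):
    """Parameter count of the shared factors; it does not depend on the number of tasks."""
    return r * (d_out + d_in)
```

In this sketch the penalty would be added to the task loss while training on a new task. Because only one pair of factors is kept, `num_trainable_params` stays fixed as tasks accumulate, which is the sense in which the abstract describes storage and inference costs as constant.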

Peer Reviews

Decision · ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. The paper is well-organized and easy to follow.
2. The investigated problem of continual learning using low-rank adaptation is important.

Weaknesses

1. It would be helpful if the authors could further elaborate on the paper’s main contributions. In particular, clarifying the fundamental challenge in integrating EWC with LoRA would strengthen the work.
2. The experiments are primarily conducted on artificial image datasets. Including additional experiments on large language models (LLMs) would make the evaluation more comprehensive and convincing.
3. The paper seems to overlook an important line of research on continual learning (continual

Reviewer 02 · Rating 8 · Confidence 4

Strengths

+ Strong theoretical grounding and extensive experiments across multiple benchmarks.
+ Well-written with clear motivation, method, and results.
+ Provides a memory-efficient, tunable, and high-performing solution for PECL.

Weaknesses

+ The Fisher estimation, though efficient, still introduces non-negligible memory overhead.
+ EWC-LoRA is indeed not very original, but the reviewer acknowledged that it is meaningful to make existing methods work in PEFT and explain why.

Reviewer 03 · Rating 6 · Confidence 4

Strengths

- The paper presents an interesting idea of revisiting weight regularization in the low-rank regime, which effectively bridges traditional CL methods and PECL approaches.
- The authors provide a comprehensive comparison of computational cost and parameter efficiency, clearly demonstrating how the proposed method performs relative to other LoRA-based continual learning baselines.
- The proposed approach shows consistent improvement across multiple datasets and experimental setups, indicating good

Weaknesses

- Accuracy of the Hessian estimation. The paper relies on the empirical Fisher information matrix to estimate the importance of weights for each task. However, as discussed in Meta-CL [1], the Fisher matrix used in EWC-based methods tends to become stale and outdated over time, leading to inaccurate importance estimation. It remains unclear whether a similar issue [1] arises in the PECL setting adopted here.
- Effectiveness under longer task sequences. When the number of tasks increases, it is u

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Face recognition and analysis · Advanced Neural Network Applications