Exploring Stability-Plasticity Trade-offs for Continual Named Entity Recognition

Duzhen Zhang; Chenxing Li; Jiahua Dong; Qi Liu; Dong Yu

arXiv:2508.03259 · cs.CL · August 6, 2025

TL;DR

This paper introduces a Stability-Plasticity Trade-off (SPT) method for Continual Named Entity Recognition that balances knowledge retention and acquisition by combining representation pooling, weight merging, and confidence-based pseudo-labeling.

Contribution

The paper proposes a novel SPT approach that improves CNER by balancing stability and plasticity through representation and weight strategies, addressing semantic shift challenges.

Findings

01

SPT outperforms previous CNER methods on multiple benchmarks.

02

The method effectively balances knowledge retention and new learning.

03

Experimental results validate the approach's superiority across diverse settings.

Abstract

Continual Named Entity Recognition (CNER) is an evolving field that focuses on sequentially updating an existing model to incorporate new entity types. Previous CNER methods primarily utilize Knowledge Distillation (KD) to preserve prior knowledge and overcome catastrophic forgetting, strictly ensuring that the representations of old and new models remain consistent. Consequently, they often impart the model with excessive stability (i.e., retention of old knowledge) but limited plasticity (i.e., acquisition of new knowledge). To address this issue, we propose a Stability-Plasticity Trade-off (SPT) method for CNER that balances these aspects from both representation and weight perspectives. From the representation perspective, we introduce a pooling operation into the original KD, permitting a level of plasticity by consolidating representation dimensions. From the weight perspective,…
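
To make the representation-side idea concrete, here is a minimal sketch of how pooling can loosen a distillation constraint. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which pooling type or axis SPT uses, so average pooling over non-overlapping groups of hidden dimensions and a mean-squared-error distance are both assumptions, and the names `avg_pool`, `strict_kd_loss`, and `pooled_kd_loss` are hypothetical.

```python
from typing import List

Vector = List[float]


def avg_pool(vec: Vector, window: int) -> Vector:
    """Average non-overlapping groups of `window` consecutive dimensions.

    Assumption: average pooling over the hidden dimension. The abstract
    does not specify the pooling type or axis.
    """
    if window <= 0 or len(vec) % window != 0:
        raise ValueError("window must be positive and divide the vector length")
    return [sum(vec[i:i + window]) / window for i in range(0, len(vec), window)]


def mse(a: Vector, b: Vector) -> float:
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def strict_kd_loss(old: List[Vector], new: List[Vector]) -> float:
    """Match every dimension of every token representation (no pooling)."""
    return sum(mse(o, n) for o, n in zip(old, new)) / len(old)


def pooled_kd_loss(old: List[Vector], new: List[Vector], window: int) -> float:
    """Match only pooled summaries, so changes inside a pooling group
    that preserve the group's mean are not penalized."""
    total = sum(mse(avg_pool(o, window), avg_pool(n, window)) for o, n in zip(old, new))
    return total / len(old)


# One token with a 4-dimensional representation (toy values).
old_reps = [[1.0, 3.0, 2.0, 2.0]]
# The new model swaps the first two dimensions; each pair keeps its mean.
new_reps = [[3.0, 1.0, 2.0, 2.0]]

print(strict_kd_loss(old_reps, new_reps))     # penalizes the change
print(pooled_kd_loss(old_reps, new_reps, 2))  # tolerates the change
```

In this toy case the strict loss is nonzero while the pooled loss is zero, which is the sense in which consolidating representation dimensions leaves the new model room to move. With `window=1` the pooled loss reduces to the strict one.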
