Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models
Junhao Zheng, Shengjie Qiu, Qianli Ma

TL;DR
This paper challenges the common belief that catastrophic forgetting is the main obstacle in incremental learning with pre-trained language models (PLMs). It shows that PLMs have an inherent anti-forgetting ability and proposes SEQ*, a simple method that achieves competitive or better results than existing techniques.
Contribution
The paper demonstrates that pre-trained language models possess strong anti-forgetting capabilities and introduces SEQ*, a simple method that achieves competitive or better results with less training effort.
Findings
Pre-trained language models (PLMs) have significant inherent anti-forgetting ability.
SEQ* matches or outperforms state-of-the-art IL methods on multiple NLP tasks.
Most existing IL methods underestimate PLMs' anti-forgetting capacity.
Abstract
Incremental Learning (IL) has been a long-standing problem in both vision and Natural Language Processing (NLP) communities. In recent years, as Pre-trained Language Models (PLMs) have achieved remarkable progress in various NLP downstream tasks, utilizing PLMs as backbones has become a common practice in recent research of IL in NLP. Most assume that catastrophic forgetting is the biggest obstacle to achieving superior IL performance and propose various techniques to overcome this issue. However, we find that this assumption is problematic. Specifically, we revisit more than 20 methods on four classification tasks (Text Classification, Intent Classification, Relation Extraction, and Named Entity Recognition) under the two most popular IL settings (Class-Incremental and Task-Incremental) and reveal that most of them severely underestimate the inherent anti-forgetting ability of PLMs.…
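The two settings named in the abstract are commonly distinguished by whether the task identity is available at test time. The sketch below illustrates that standard distinction only; it is not taken from the paper, and `predict`, `task_classes`, and the logit values are hypothetical names and numbers chosen for illustration.

```python
from typing import Dict, List, Optional, Sequence


def predict(logits: Sequence[float],
            task_classes: Dict[int, List[int]],
            task_id: Optional[int] = None) -> int:
    """Return the predicted class index.

    Class-Incremental: task_id is None, so every class seen so far competes.
    Task-Incremental: task_id is given, so only that task's classes compete.
    """
    if task_id is None:
        candidates = [c for classes in task_classes.values() for c in classes]
    else:
        candidates = task_classes[task_id]
    # Pick the candidate class with the highest logit.
    return max(candidates, key=lambda c: logits[c])


# Hypothetical setup: two tasks learned in sequence, two classes each.
task_classes = {0: [0, 1], 1: [2, 3]}
logits = [0.1, 2.0, 2.5, 0.3]

cil_pred = predict(logits, task_classes)             # all four classes compete
til_pred = predict(logits, task_classes, task_id=0)  # only classes 0 and 1 compete
```

With the same logits, the two settings can disagree: the Class-Incremental prediction may fall on a class from another task, while the Task-Incremental prediction is restricted to the classes of the given task.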
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
