Unveiling and Addressing Pseudo Forgetting in Large Language Models

Huashan Sun; Yizhe Yang; Yinghao Li; Jiawei Li; Yang Gao

arXiv:2411.11932 · cs.LG · June 10, 2025

TL;DR

This paper identifies pseudo forgetting in large language models, where performance drops on previous tasks stem from instructions failing to activate the model's existing abilities rather than from capability loss, and proposes interventions and a new framework to address it.

Contribution

It introduces the concept of pseudo forgetting, analyzes its internal mechanisms, and proposes the Rationale-Guidance Difficulty based Replay (RGD-R) framework to mitigate it.

Findings

01. Providing a partial correct rationale restores performance on previous tasks.

02. Appending semantically meaningless suffixes to the original instructions guides correct rationale generation.

03. RGD-R effectively reduces pseudo forgetting.

Abstract

Although substantial efforts have been made to mitigate catastrophic forgetting in continual learning, the intrinsic mechanisms are not well understood. In this work, we demonstrate the existence of "pseudo forgetting": the performance degradation on previous tasks is not attributed to a loss of capabilities, but rather to the failure of the instructions to activate the appropriate model abilities. We show that the model's performance on previous tasks can be restored through two simple interventions: (1) providing partial external correct rationale, and (2) appending semantically meaningless suffixes to the original instructions, to guide the generation of correct rationales. Through empirical analysis of the internal mechanisms governing rationale generation, we reveal that models exhibiting pseudo forgetting show reduced instruction dependence during rationale generation, leading to…
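
The two interventions named in the abstract can be pictured as simple prompt edits. The sketch below is illustrative only: the function names, the word-level truncation, the `fraction` parameter, and the way the pieces are joined are all assumptions made for this example, and the paper's actual prompts, suffixes, and evaluation setup are not reproduced here.

```python
def with_partial_rationale(instruction: str, rationale: str, fraction: float = 0.5) -> str:
    """Intervention (1), sketched: prepend the first part of a known-correct
    rationale so the model continues from it. Word-level truncation and the
    newline separator are assumptions of this example."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    words = rationale.split()
    keep = int(len(words) * fraction)  # number of leading words to reveal
    prefix = " ".join(words[:keep])
    return f"{instruction}\n{prefix}" if prefix else instruction


def with_suffix(instruction: str, suffix: str) -> str:
    """Intervention (2), sketched: append a semantically meaningless suffix to
    the original instruction. How such a suffix is chosen is not specified
    here; the caller supplies it."""
    return f"{instruction} {suffix}"


if __name__ == "__main__":
    question = "What is 2 + 3?"
    print(with_partial_rationale(question, "Add two and three", fraction=0.5))
    print(with_suffix(question, "zx qv lm"))  # hypothetical placeholder suffix
```

In both cases the model weights are untouched; only the input changes, which is why restored performance would point to an activation failure rather than lost capability.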

Taxonomy

Topics: Topic Modeling