Demystifying Language Model Forgetting with Low-rank Example Associations

Xisen Jin; Xiang Ren

arXiv:2406.14026·cs.LG·December 9, 2025

TL;DR

This paper investigates how fine-tuning large language models causes forgetting of upstream knowledge. It shows that the matrices recording how much each upstream example is forgotten after learning each new task are often well approximated by low-rank matrices, which enables efficient prediction of forgotten examples and targeted mitigation.

Contribution

It introduces a low-rank matrix approximation approach to analyze and predict forgetting in LLMs, outperforming prior semantic-based methods and enabling targeted mitigation.

Findings

1. Low-rank matrices effectively model forgetting patterns.
2. Matrix completion accurately predicts forgotten examples.
3. Upweighting predicted examples reduces forgetting during fine-tuning.
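
To make the first finding concrete, here is a minimal sketch of how one might check whether a task-by-example forgetting matrix is close to low rank. It uses a truncated SVD and reports the fraction of variance explained. The matrix `Z` below is synthetic, and the sizes, the noise level, and the helper names `low_rank_approx` and `r_squared` are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def low_rank_approx(Z, rank):
    """Best rank-`rank` approximation of Z in Frobenius norm (truncated SVD)."""
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def r_squared(Z, Z_hat):
    """Fraction of the variance in Z that Z_hat accounts for."""
    ss_res = np.sum((Z - Z_hat) ** 2)
    ss_tot = np.sum((Z - Z.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for an M x N forgetting matrix (assumption):
# row i = a fine-tuning task, column j = an upstream example,
# entry = how much example j degrades after fine-tuning on task i.
rng = np.random.default_rng(0)
M, N, true_rank = 20, 200, 3
Z = rng.normal(size=(M, true_rank)) @ rng.normal(size=(true_rank, N))
Z += 0.05 * rng.normal(size=(M, N))  # small noise on top of the low-rank signal

scores = {k: r_squared(Z, low_rank_approx(Z, k)) for k in (1, 3, 5)}
for k, score in scores.items():
    print(f"rank {k}: R^2 = {score:.3f}")
```

If the explained variance saturates at a small rank, as it does by construction here, a few task-level and example-level factors account for most of the forgetting pattern.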

Abstract

Large language models (LLMs) suffer from forgetting of upstream knowledge when fine-tuned. Despite efforts on mitigating forgetting, few have investigated how forgotten upstream examples are dependent on newly learned tasks. Insights on such dependencies enable efficient and targeted mitigation of forgetting. In this paper, we empirically analyze forgetting that occurs in $N$ upstream examples of language modeling or instruction-tuning after fine-tuning LLMs on one of $M$ new tasks, visualized in $M \times N$ matrices. We show that the matrices are often well-approximated with low-rank matrices, indicating the dominance of simple associations between the learned tasks and forgotten upstream examples. Leveraging the analysis, we predict forgetting of upstream examples when fine-tuning LLMs on unseen tasks with matrix completion over the empirical associations. This enables fast…
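
The prediction step described in the abstract can be illustrated with a generic factor-based form of matrix completion. This is a sketch under stated assumptions, not the authors' exact algorithm, which this excerpt does not specify. It assumes that forgetting has been measured on all $N$ upstream examples for $M$ earlier tasks, that only a small subset of examples has been measured for a new task, and that larger values mean more forgetting. The function names, the ridge penalty, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_example_factors(Z_known, rank):
    """Example-side factors (rank x N) from the tasks already measured."""
    _, _, Vt = np.linalg.svd(Z_known, full_matrices=False)
    return Vt[:rank]

def predict_new_task_row(Vt, observed_idx, observed_vals, ridge=1e-6):
    """Fit the new task's coefficients on the observed entries by ridge
    regression, then predict forgetting for all N upstream examples."""
    A = Vt[:, observed_idx].T  # (n_observed, rank)
    k = Vt.shape[0]
    w = np.linalg.solve(A.T @ A + ridge * np.eye(k), A.T @ observed_vals)
    return w @ Vt  # (N,)

# Synthetic data (assumption): exactly rank-3 associations.
rng = np.random.default_rng(0)
M, N, rank = 30, 300, 3
example_factors = rng.normal(size=(rank, N))
Z_known = rng.normal(size=(M, rank)) @ example_factors  # earlier tasks
new_row = rng.normal(size=rank) @ example_factors       # unseen task (ground truth)

# Measure forgetting on only 30 of the 300 upstream examples for the new task.
observed_idx = rng.choice(N, size=30, replace=False)
Vt = fit_example_factors(Z_known, rank)
pred = predict_new_task_row(Vt, observed_idx, new_row[observed_idx])

# Examples predicted to be forgotten most; candidates for upweighting in replay.
replay_ids = np.argsort(pred)[::-1][:20]
corr = np.corrcoef(pred, new_row)[0, 1]
print(f"correlation between predicted and true forgetting: {corr:.4f}")
```

The appeal of this kind of approach is cost: fitting a handful of coefficients from a few measured examples is far cheaper than evaluating the fine-tuned model on every upstream example.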

Taxonomy

Topics: Natural Language Processing Techniques