Mechanistic Analysis of Catastrophic Forgetting in Large Language Models During Continual Fine-tuning

Olaf Yunus Laitinen Imanov

arXiv:2601.18699 · cs.LG · January 27, 2026

Open Access

TL;DR

This paper gives a mechanistic account of why catastrophic forgetting occurs in large language models during sequential fine-tuning, identifying three mechanisms: gradient interference in attention weights, representational drift in intermediate layers, and loss landscape flattening.

Contribution

It offers the first comprehensive analysis of the mechanisms behind catastrophic forgetting in transformer-based LLMs during continual fine-tuning, across multiple model scales and task sequences.

Findings

01. Gradient interference in attention weights contributes to forgetting.

02. Representational drift occurs in intermediate layers during fine-tuning.

03. Forgetting severity correlates with task similarity and gradient alignment.
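
One common way to quantify the gradient alignment mentioned in the findings is the cosine similarity between the loss gradient of an earlier task and that of the new task, both taken at the same parameters. The sketch below is illustrative only: this summary does not state the metric the paper actually uses, the function names are hypothetical, and the toy gradient vectors are invented for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumed convention: a zero gradient counts as unaligned.
        return 0.0
    return dot / (norm_u * norm_v)


def gradient_alignment(grad_old_task, grad_new_task):
    """Alignment score in [-1, 1] between two flattened gradients.

    Negative values mean a step that lowers the new-task loss tends to
    raise the old-task loss (interference); positive values mean the
    two tasks pull the parameters in a similar direction.
    """
    return cosine_similarity(grad_old_task, grad_new_task)


# Toy flattened gradients (invented values, not from the paper).
g_old = [1.0, 0.0, 2.0]
g_new_conflicting = [-1.0, 0.0, -2.0]
g_new_orthogonal = [0.0, 3.0, 0.0]

conflict_score = gradient_alignment(g_old, g_new_conflicting)
orthogonal_score = gradient_alignment(g_old, g_new_orthogonal)
```

In practice the two vectors would be gradients of a chosen parameter group (for example, the attention weights of one layer) flattened into a single list or tensor; that choice is an assumption of this sketch rather than something the summary specifies.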

Abstract

Large language models exhibit remarkable performance across diverse tasks through pre-training and fine-tuning paradigms. However, continual fine-tuning on sequential tasks induces catastrophic forgetting, where newly acquired knowledge interferes with previously learned capabilities. Despite widespread observations of this phenomenon, the mechanistic understanding remains limited. Here, we present a comprehensive mechanistic analysis of catastrophic forgetting in transformer-based LLMs during sequential fine-tuning. Through systematic experiments across multiple model scales (109B to 400B total parameters) and task sequences, we identify three primary mechanisms driving forgetting: gradient interference in attention weights, representational drift in intermediate layers, and loss landscape flattening. We demonstrate that forgetting severity correlates strongly with task similarity…
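
Representational drift of the kind the abstract describes can be approximated by comparing a layer's hidden states on the same probe inputs before and after fine-tuning. The following is a minimal sketch using mean cosine distance; it is not claimed to be the paper's measure, and the function names and toy activations are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means same direction, 2 means opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumed convention: a zero vector is treated as maximally unrelated.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def representational_drift(states_before, states_after):
    """Mean cosine distance between matched hidden-state vectors.

    Row i of each argument is the hidden state of the same probe input,
    taken at the same layer, before and after fine-tuning.
    """
    if not states_before or len(states_before) != len(states_after):
        raise ValueError("need two non-empty, equally sized sets of states")
    total = sum(
        cosine_distance(b, a) for b, a in zip(states_before, states_after)
    )
    return total / len(states_before)


# Toy activations for two probe inputs (invented values).
before = [[1.0, 0.0], [0.0, 1.0]]
after = [[0.0, 1.0], [0.0, 1.0]]  # first input rotated, second unchanged

drift = representational_drift(before, after)
no_drift = representational_drift(before, before)
```

Computing this score per layer would show where drift concentrates. Cosine distance is sensitive to rotations of the hidden basis, so a rotation-invariant similarity measure may be a better fit depending on the question being asked.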

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Memory Processes and Influences · Multimodal Machine Learning Applications