The Effectiveness of Approximate Regularized Replay for Efficient Supervised Fine-Tuning of Large Language Models

Matthew Riemer; Erik Miehling; Miao Liu; Djallel Bouneffouf; Murray Campbell

arXiv:2512.22337·cs.LG·December 30, 2025

Open Access

TL;DR

This paper proposes an approximate replay method with regularization to improve the stability and effectiveness of LoRA-based fine-tuning of large language models, preserving knowledge while adapting to new tasks.

Contribution

It introduces a regularized approximate replay technique that mitigates catastrophic forgetting during efficient supervised fine-tuning of large language models.

Findings

01. Regularized replay preserves the model's existing knowledge during fine-tuning.

02. The method maintains the model's plasticity for learning new tasks.

03. It adds minimal computational overhead to existing fine-tuning procedures.

Abstract

Although parameter-efficient fine-tuning methods, such as LoRA, only modify a small subset of parameters, they can have a significant impact on the model. Our instruction-tuning experiments show that LoRA-based supervised fine-tuning can catastrophically degrade model capabilities, even when trained on very small datasets for relatively few steps. With that said, we demonstrate that while the most straightforward approach (that is likely the most used in practice) fails spectacularly, small tweaks to the training procedure with very little overhead can virtually eliminate the problem. Particularly, in this paper we consider a regularized approximate replay approach which penalizes KL divergence with respect to the initial model and interleaves in data for next token prediction from a different, yet similar, open access corpus to what was used in pre-training. When applied to Qwen…
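The abstract names two ingredients: a KL penalty with respect to the initial model, and interleaved next-token-prediction data from a pre-training-like corpus. The sketch below shows one way such a combined objective could be computed on toy logits. It is not the authors' implementation: the function names, the weights `kl_weight` and `replay_weight`, the KL direction (initial model relative to the fine-tuned one), and the choice to apply the penalty to both fine-tuning and replay tokens are all assumptions made for illustration.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over one next-token distribution."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]

def cross_entropy(logits, target):
    """Next-token prediction loss for a single position."""
    return -log_softmax(logits)[target]

def kl_divergence(ref_logits, cur_logits):
    """KL(ref || cur) for one position; the direction is an assumption."""
    ref_lp = log_softmax(ref_logits)
    cur_lp = log_softmax(cur_logits)
    return sum(math.exp(r) * (r - c) for r, c in zip(ref_lp, cur_lp))

def mean(values):
    return sum(values) / len(values)

def regularized_replay_loss(sft_tokens, replay_tokens,
                            kl_weight=0.1, replay_weight=1.0):
    """Hypothetical combined objective.

    Each element of sft_tokens / replay_tokens is a tuple
    (cur_logits, ref_logits, target), where cur_logits come from the
    model being fine-tuned and ref_logits from the frozen initial model.
    """
    # Supervised fine-tuning loss on the new task data.
    sft_ce = mean([cross_entropy(cur, t) for cur, _, t in sft_tokens])
    # Approximate replay: next-token loss on interleaved corpus data.
    replay_ce = mean([cross_entropy(cur, t) for cur, _, t in replay_tokens])
    # Penalize drift away from the initial model's predictions.
    kl = mean([kl_divergence(ref, cur)
               for cur, ref, _ in sft_tokens + replay_tokens])
    return sft_ce + replay_weight * replay_ce + kl_weight * kl
```

When the fine-tuned model's logits match the initial model's, the KL term is zero and the loss reduces to the two cross-entropy terms, so the penalty only becomes active once the adapter starts to move predictions away from the starting point.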

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification