The Effectiveness of Approximate Regularized Replay for Efficient Supervised Fine-Tuning of Large Language Models
Matthew Riemer, Erik Miehling, Miao Liu, Djallel Bouneffouf, Murray Campbell

TL;DR
This paper proposes an approximate replay method with regularization to improve the stability and effectiveness of LoRA-based fine-tuning of large language models, preserving knowledge while adapting to new tasks.
Contribution
It introduces a regularized approximate replay technique that mitigates catastrophic forgetting during efficient supervised fine-tuning of large language models.
Findings
Regularized replay preserves model knowledge during fine-tuning.
The method maintains model plasticity for new tasks.
The approach adds minimal computational overhead to existing fine-tuning procedures.
Abstract
Although parameter-efficient fine-tuning methods, such as LoRA, only modify a small subset of parameters, they can have a significant impact on the model. Our instruction-tuning experiments show that LoRA-based supervised fine-tuning can catastrophically degrade model capabilities, even when trained on very small datasets for relatively few steps. With that said, we demonstrate that while the most straightforward approach (that is likely the most used in practice) fails spectacularly, small tweaks to the training procedure with very little overhead can virtually eliminate the problem. Particularly, in this paper we consider a regularized approximate replay approach which penalizes KL divergence with respect to the initial model and interleaves in data for next token prediction from a different, yet similar, open access corpus to what was used in pre-training. When applied to Qwen…
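The two ingredients named in the abstract, a KL penalty toward the initial model and interleaved next-token-prediction data, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The KL direction (current model relative to initial model), the weight `kl_weight`, the schedule parameter `replay_every`, and all function names are assumptions introduced here for illustration; the abstract does not specify them.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def cross_entropy(logits, target):
    """Next-token negative log-likelihood for a single position."""
    return -log_softmax(logits)[target]

def kl_divergence(logits_p, logits_q):
    """KL(p || q) for two categorical distributions given as logits."""
    lp = log_softmax(logits_p)
    lq = log_softmax(logits_q)
    return sum(math.exp(a) * (a - b) for a, b in zip(lp, lq))

def regularized_loss(current_logits, initial_logits, targets, kl_weight=0.1):
    """Mean per-token loss: cross-entropy plus a KL penalty that keeps the
    fine-tuned model close to the frozen initial model.
    Assumption: KL(current || initial) with a hypothetical weight."""
    total = 0.0
    for cur, init, tgt in zip(current_logits, initial_logits, targets):
        total += cross_entropy(cur, tgt) + kl_weight * kl_divergence(cur, init)
    return total / len(targets)

def interleave(sft_batches, replay_batches, replay_every=2):
    """Yield ("sft", batch) items, inserting one ("replay", batch) after every
    `replay_every` SFT batches. The replay batches stand in for text drawn
    from an open corpus similar to the pre-training data (assumed schedule)."""
    replay_iter = iter(replay_batches)
    for i, batch in enumerate(sft_batches, start=1):
        yield ("sft", batch)
        if i % replay_every == 0:
            nxt = next(replay_iter, None)
            if nxt is not None:
                yield ("replay", nxt)
```

In a real training loop, each yielded batch would be passed through the LoRA-adapted model and the frozen initial model, and `regularized_loss` would be replaced by the tensor equivalent in the training framework being used. When the current and initial logits coincide, the KL term vanishes and the loss reduces to plain cross-entropy.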
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
