Stabilizing LLM Supervised Fine-Tuning via Explicit Distributional Control

Xinyu Wang; Changzhi Sun; Yuanbin Wu; Xiaoling Wang

arXiv:2605.04468 · cs.LG · May 7, 2026

TL;DR

This paper introduces Anchored Learning, a distributional control framework for LLM fine-tuning that reduces catastrophic forgetting by limiting how far the model's output distribution moves at each update, improving stability while retaining gains on the target objective.

Contribution

The paper proposes an anchor-based method in which the distillation target interpolates between the current model and a frozen reference, supported by a per-iteration KL-divergence bound and empirical validation on multiple benchmarks.

Findings

01. Significantly reduces degradation of previously acquired capabilities during fine-tuning.

02. Achieves near-optimal performance gains on the target objective while maintaining stability.

03. A proven linear upper bound on the per-iteration KL divergence keeps distributional updates stable.
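
For intuition about the third finding, here is one standard way a bound that is linear in the interpolation weight can arise. This is a sketch under assumptions, not the paper's theorem: the notation ($p_t$ for the current model, $r$ for the frozen reference, $q_t$ for the anchor, $\alpha \in [0,1]$ for the interpolation weight) is ours, and we assume the anchor is a convex mixture in probability space. The inequality itself is just convexity of the KL divergence in its first argument.

```latex
% Assumed anchor (hypothetical form): a convex mixture of current model and reference
q_t = (1-\alpha)\, p_t + \alpha\, r, \qquad \alpha \in [0,1]

% Convexity of KL(. || p_t) in its first argument gives a bound linear in alpha
\mathrm{KL}(q_t \,\|\, p_t)
  \;\le\; (1-\alpha)\,\mathrm{KL}(p_t \,\|\, p_t) + \alpha\,\mathrm{KL}(r \,\|\, p_t)
  \;=\; \alpha\,\mathrm{KL}(r \,\|\, p_t)
```

Under this assumption, the target the model is asked to match at each step is at most $\alpha\,\mathrm{KL}(r \,\|\, p_t)$ away from where the model currently is, which is the sense in which each update stays local.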

Abstract

Post-training of large language models (LLMs) often suffers from catastrophic forgetting, where improvements on a target objective degrade previously acquired capabilities. Recent evidence suggests that this phenomenon is primarily driven by excessive distributional drift during optimization. Motivated by this perspective, we propose Anchored Learning, a simple framework that explicitly controls distributional updates during offline fine-tuning via a dynamically evolving moving anchor. Instead of matching a fixed reference distribution, the anchor interpolates between the current model and a frozen reference to construct an intermediate target that the model distills toward, transforming global fine-tuning into a sequence of local trust-region updates in distribution space. Theoretically, we prove this anchor-based update admits a linear KL-divergence upper bound per iteration, ensuring a…
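
The moving-anchor idea described in the abstract can be illustrated on a single categorical distribution. The following is a minimal toy sketch, not the authors' implementation: the function names (`make_anchor`, `anchored_step`), the weight `alpha`, the learning rate, and the choice of a convex mixture in probability space are all assumptions made for illustration, and a real LLM setup would apply this per token over the vocabulary.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl(p, q):
    # KL(p || q) for strictly positive categorical distributions.
    return float(np.sum(p * (np.log(p) - np.log(q))))

def make_anchor(p_current, p_reference, alpha):
    # Hypothetical anchor: convex mixture of the current model and the frozen reference.
    return (1.0 - alpha) * p_current + alpha * p_reference

def anchored_step(logits, p_reference, alpha, lr):
    # One gradient step on KL(anchor || softmax(logits)), treating the anchor as a constant.
    p = softmax(logits)
    anchor = make_anchor(p, p_reference, alpha)
    grad = p - anchor  # gradient w.r.t. logits of cross-entropy against a fixed target
    return logits - lr * grad

rng = np.random.default_rng(0)
logits = rng.normal(size=8)            # toy "current model"
p_ref = softmax(rng.normal(size=8))    # toy "frozen reference"
alpha, lr = 0.1, 0.5                   # assumed values, chosen only for the demo

p0 = softmax(logits)
anchor = make_anchor(p0, p_ref, alpha)
drift_to_anchor = kl(anchor, p0)       # how far the per-step target is from the model
drift_to_reference = kl(p_ref, p0)     # how far the fixed reference is from the model

# Repeating the local step re-anchors each time, so the target moves with the model.
z = logits.copy()
for _ in range(200):
    z = anchored_step(z, p_ref, alpha, lr)
final_gap = kl(p_ref, softmax(z))

print(f"KL(anchor || p0)    = {drift_to_anchor:.4f}")
print(f"KL(reference || p0) = {drift_to_reference:.4f}")
print(f"KL(reference || p)  after 200 anchored steps = {final_gap:.4f}")
```

In this toy logit parameterization, `p - anchor` equals `alpha * (p - p_ref)`, so each anchored step is a scaled-down step toward the reference. Here `alpha` acts like a step size in distribution space, which is one way to read the "sequence of local trust-region updates" phrasing.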
