Rotation-Preserving Supervised Fine-Tuning

Hangzhan Jin; Tianwei Ni; Lu Li; Pierre-Luc Bacon; Mohammad Hamdaqa; Doina Precup

arXiv:2605.10973 · cs.LG · May 13, 2026

TL;DR

RPSFT is a supervised fine-tuning method that penalizes rotation of the top-$k$ singular subspaces of pretrained weight matrices, improving the in-domain/out-of-domain (OOD) trade-off and better preserving pretrained representations.

Contribution

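The paper introduces a rotation-preserving regularizer for supervised fine-tuning: preserving projected rotations in pretrained singular subspaces serves as an efficient proxy for protecting Fisher-sensitive directions, which are expensive to identify directly at LLM scale.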

Findings

01 RPSFT improves the in-domain/OOD trade-off over standard SFT.

02 It better preserves pretrained representations.

03 It provides stronger initializations for downstream RL fine-tuning.

Abstract

Supervised fine-tuning (SFT) improves in-domain performance but can degrade out-of-domain (OOD) generalization. Prior work suggests that this degradation is related to changes in dominant singular subspaces of pretrained weight matrices. However, directly identifying loss-sensitive directions with Hessian or Fisher information is computationally expensive at LLM scale. In this work, we propose preserving projected rotations in pretrained singular subspaces as an efficient proxy for Fisher-sensitive directions, which we call Rotation-Preserving Supervised Fine-Tuning (RPSFT). RPSFT penalizes changes in the projected top-$k$ singular-vector block of each pretrained weight matrix, limiting unnecessary rotation while preserving task adaptation. Across model families and sizes trained on math reasoning data, RPSFT improves the in-domain/OOD trade-off over standard SFT and strong SFT…
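
The abstract does not give the exact form of the penalty, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the "projected top-$k$ block" of a weight matrix W is U_k^T W V_k, where U_k and V_k are the top-k left and right singular vectors of the pretrained matrix W0, and that the regularizer is the squared Frobenius norm of the change in that block. The function names and the NumPy setting are hypothetical choices made for illustration.

```python
import numpy as np


def top_k_subspaces(w0, k):
    """Top-k left and right singular vectors of the pretrained matrix w0."""
    u, _, vt = np.linalg.svd(w0, full_matrices=False)
    return u[:, :k], vt[:k, :].T


def projected_block_penalty(w, w0, u_k, v_k):
    """Assumed penalty: squared Frobenius norm of U_k^T (W - W0) V_k.

    This is an illustrative guess at a 'projected top-k block' penalty,
    not the formula from the paper.
    """
    block_change = u_k.T @ (w - w0) @ v_k
    return float(np.sum(block_change ** 2))


rng = np.random.default_rng(0)
w0 = rng.standard_normal((8, 6))  # stand-in for a pretrained weight matrix
u_k, v_k = top_k_subspaces(w0, k=2)

# An update confined to the complement of the top-k subspaces is not penalized.
delta = rng.standard_normal((8, 6))
delta_free = (np.eye(8) - u_k @ u_k.T) @ delta @ (np.eye(6) - v_k @ v_k.T)
penalty_free = projected_block_penalty(w0 + delta_free, w0, u_k, v_k)

# An update inside the top-k block is penalized by its full squared norm.
m = rng.standard_normal((2, 2))
delta_inside = u_k @ m @ v_k.T
penalty_inside = projected_block_penalty(w0 + delta_inside, w0, u_k, v_k)
```

Under these assumptions, a training loop would add a weighted sum of such penalties over the regularized weight matrices to the usual SFT loss. The weight and the choice of k are hyperparameters that the truncated abstract does not specify.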

Code & Models

Repositories

jinhangzhan/RPSFT (GitHub)
