Massive Supervised Fine-tuning Experiments Reveal How Data, Layer, and Training Factors Shape LLM Alignment Quality

Yuto Harada; Yusuke Yamauchi; Yusuke Oda; Yohei Oseki; Yusuke Miyao; Yu Takagi

arXiv:2506.14681 · cs.CL · October 31, 2025

Open Access

TL;DR

This study systematically investigates how training data, layer-wise weight changes, and training factors shape the alignment quality of large language models, using more than 1,000 supervised fine-tuning (SFT) models trained under controlled conditions.

Contribution

The paper provides a large-scale, controlled analysis of the factors affecting SFT: which dataset properties matter most, how SFT changes model weights layer by layer, and how well perplexity predicts SFT success.

Findings

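01. Perplexity consistently predicts SFT effectiveness, often better than superficial similarity between the training data and the benchmark.

02. Layer-wise weight changes correlate with performance improvements, most strongly in the middle layers.

03. Some training-task synergies persist across all models, while others vary substantially from model to model.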
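For readers unfamiliar with the metric behind the first finding, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below is a minimal illustration, not the authors' code: it assumes per-token natural-log probabilities have already been obtained from some model, and the function names `perplexity` and `corpus_perplexity` are hypothetical. The summary above does not specify which text or which model the authors score.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of one sequence, given natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood per token.
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


def corpus_perplexity(sequences):
    """Token-weighted perplexity over several sequences.

    Longer sequences contribute more tokens, so this is not the
    average of the per-sequence perplexities.
    """
    total_nll = 0.0
    total_tokens = 0
    for logprobs in sequences:
        total_nll -= sum(logprobs)
        total_tokens += len(logprobs)
    if total_tokens == 0:
        raise ValueError("need at least one token log-probability")
    return math.exp(total_nll / total_tokens)
```

A model that assigns probability 0.5 to every token has perplexity 2 under this definition; lower values mean the model finds the text less surprising.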

Abstract

Supervised fine-tuning (SFT) is a critical step in aligning large language models (LLMs) with human instructions and values, yet many aspects of SFT remain poorly understood. We trained a wide range of base models on a variety of datasets including code generation, mathematical reasoning, and general-domain tasks, resulting in 1,000+ SFT models under controlled conditions. We then identified the dataset properties that matter most and examined the layer-wise modifications introduced by SFT. Our findings reveal that some training-task synergies persist across all models while others vary substantially, emphasizing the importance of model-specific strategies. Moreover, we demonstrate that perplexity consistently predicts SFT effectiveness, often surpassing superficial similarity between the training data and the benchmark, and that mid-layer weight changes correlate most strongly with…
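One simple way to quantify the layer-wise modifications mentioned in the abstract is a relative L2 distance between base and fine-tuned weights, computed per layer. This is only a sketch under stated assumptions: the abstract does not say which measure the authors use, the relative L2 norm is chosen here for illustration, and the names `relative_change` and `layerwise_change` are hypothetical. Weights are represented as flat lists of floats keyed by layer name.

```python
import math


def relative_change(base, tuned):
    """Return ||tuned - base||_2 / ||base||_2 for two flat weight lists."""
    if len(base) != len(tuned):
        raise ValueError("weight lists must have the same length")
    diff_sq = sum((t - b) ** 2 for b, t in zip(base, tuned))
    base_sq = sum(b * b for b in base)
    if base_sq == 0.0:
        raise ValueError("base weights have zero norm")
    return math.sqrt(diff_sq / base_sq)


def layerwise_change(base_layers, tuned_layers):
    """Map each layer name to its relative weight change (assumed metric)."""
    return {
        name: relative_change(weights, tuned_layers[name])
        for name, weights in base_layers.items()
    }
```

Given the per-layer values for many fine-tuned models, one could then correlate each layer's change with benchmark gains to see which depths track performance most closely.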

Taxonomy

Topics: Advanced Materials Characterization Techniques

Methods: Shrink and Fine-Tune · Balanced Selection