Decomposing the Depth Profile of Fine-Tuning

Jayadev Billa

arXiv:2604.17177·cs.LG·April 21, 2026

TL;DR

This paper measures where in a network's depth fine-tuning changes representations, across four architecture families at scales from 125M to 6.9B parameters, and tests whether that depth profile survives a control that equalizes relative update size across layers.

Contribution

It measures the depth profile of representational change across 240 fine-tuning runs on 15 models and introduces a per-layer control that equalizes $\|\Delta W\|/\|W\|$ across layers after each optimizer step.

Findings

01

Representational change concentrates in output-proximal layers in every standard-training run except one.

02

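Under the per-layer control, the profile persists in some conditions and collapses in others; at 125M to 350M, sequential-block architectures (BERT, OPT, GPT-2) retain the slope across tested objectives.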

03

Profile shape correlates with training objectives, initialization, and architecture.

Abstract

Fine-tuning adapts pretrained networks to new objectives. Whether the resulting depth profile of representational change reflects an intrinsic property of the model or the magnitude of gradient flow has not been tested directly. We measure this profile across 240 fine-tuning runs spanning 15 models in four architecture families (encoder and decoder transformers, a state-space model, and an RNN) at scales from 125M to 6.9B parameters. Representational change concentrates in output-proximal layers in every standard-training run except one. We apply a per-layer control that equalizes $\|\Delta W\|/\|W\|$ across layers after each optimizer step. Under this control, the profile persists in some conditions and collapses in others. At 125M to 350M, sequential-block architectures (BERT, OPT, GPT-2) retain the slope across tested objectives while parallel-block architectures (Pythia, CodeGen)…
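
The per-layer control described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `equalize_relative_updates`, the use of the L2 (Frobenius) norm, measuring the ratio against the pre-step weights, and the choice of the mean ratio as the shared target are all assumptions made for illustration. Each layer's weights are represented as a flat list of floats.

```python
import math


def l2_norm(values):
    """L2 norm of a flat list of floats (Frobenius norm of a flattened matrix)."""
    return math.sqrt(sum(v * v for v in values))


def equalize_relative_updates(before, after, eps=1e-12):
    """Rescale each layer's update so ||dW|| / ||W|| is the same for every layer.

    before, after: dicts mapping a layer name to its flattened weights
    before and after one optimizer step.
    Assumption: the shared target is the mean of the per-layer ratios;
    the abstract does not say which target the paper uses.
    """
    deltas = {
        name: [a - b for a, b in zip(after[name], before[name])]
        for name in before
    }
    ratios = {
        name: l2_norm(deltas[name]) / (l2_norm(before[name]) + eps)
        for name in before
    }
    target = sum(ratios.values()) / len(ratios)

    controlled = {}
    for name in before:
        # Scale the update direction without changing it, so only its size moves.
        scale = target / (ratios[name] + eps)
        controlled[name] = [
            b + scale * d for b, d in zip(before[name], deltas[name])
        ]
    return controlled, target


# Toy example: two "layers" whose raw relative updates differ (0.1 vs 0.2).
before = {"layer0": [3.0, 4.0], "layer1": [6.0, 8.0]}
after = {"layer0": [3.0, 4.5], "layer1": [8.0, 8.0]}
controlled, target = equalize_relative_updates(before, after)
```

In a real training loop the same rescaling would be applied to each layer's parameter tensors immediately after the optimizer step, using a snapshot of the weights taken before the step.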
