Linearization Explains Fine-Tuning in Large Language Models

Zahra Rahimi Afzal; Tara Esmaeilbeig; Mojtaba Soltanalian; Mesrob I. Ohannessian

arXiv:2602.08239·cs.LG·February 10, 2026

TL;DR

This paper uses linearization to explain why parameter-efficient fine-tuning (PEFT) works in large language models. It links the spectral properties of the neural tangent kernel (NTK) to adaptation performance and supports the link with theoretical and empirical results.

Contribution

The paper introduces a linearization perspective on PEFT, analyzes the role of the NTK spectrum in fine-tuning, and validates its findings with experiments on LoRA and LLMs.

Findings

01. Strong correlation between NTK eigenvalues and model performance.

02. Spectral perturbation bounds inform layer selection for fine-tuning.

03. Linearization approximates fine-tuning dynamics under regularization.
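
As a rough illustration of what an NTK spectrum and a spectral perturbation bound look like in practice, the sketch below builds an empirical NTK for a tiny two-layer network, splits it into per-layer contributions, and checks Weyl's inequality, a standard eigenvalue perturbation bound. The network, the sizes, and the name `layer_jacobians` are hypothetical choices for illustration; the bounds and the layer-selection rule used in the paper may differ.

```python
import numpy as np

def layer_jacobians(W, v, X):
    """Per-layer Jacobians of f(x) = v @ tanh(W @ x), one row per example in X."""
    A = np.tanh(X @ W.T)                  # (n, h) hidden activations
    J_v = A                               # df/dv = tanh(W x)
    G = (1.0 - A**2) * v                  # (n, h) df/d(pre-activation)
    # df/dW[i, j] = G[i] * x[j], flattened row-major to length h * d
    J_W = (G[:, :, None] * X[:, None, :]).reshape(len(X), -1)
    return J_W, J_v

# Hypothetical toy problem: 6 examples, 4 inputs, 5 hidden units.
rng = np.random.default_rng(0)
n, d, h = 6, 4, 5
X = rng.normal(size=(n, d))
W = rng.normal(size=(h, d)) / np.sqrt(d)
v = rng.normal(size=h) / np.sqrt(h)

J_W, J_v = layer_jacobians(W, v, X)
K_W = J_W @ J_W.T                         # kernel from tuning W only
K_v = J_v @ J_v.T                         # kernel from tuning v only
K = K_W + K_v                             # empirical NTK over all parameters

eig_full = np.linalg.eigvalsh(K)          # ascending eigenvalues
eig_W = np.linalg.eigvalsh(K_W)

# Weyl's inequality: dropping layer v moves each sorted eigenvalue
# by at most the spectral norm of that layer's kernel.
shift = np.abs(eig_full - eig_W).max()
bound = np.linalg.norm(K_v, 2)
print(f"max eigenvalue shift {shift:.4f} <= bound {bound:.4f}")
```

Because the full kernel is a sum of per-layer kernels, comparing `eig_full` with the eigenvalues of each partial kernel gives one simple way to see how much a given layer contributes to the spectrum.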

Abstract

Parameter-Efficient Fine-Tuning (PEFT) is a popular class of techniques that strive to adapt large models in a scalable and resource-efficient manner. Yet, the mechanisms underlying their training performance and generalization remain underexplored. In this paper, we provide several insights into such fine-tuning through the lens of linearization. Fine-tuned models are often implicitly encouraged to remain close to the pretrained model. By making this explicit, using a Euclidean distance inductive bias in parameter space, we show that fine-tuning dynamics become equivalent to learning with the positive-definite neural tangent kernel (NTK). We specifically analyze how close the fully linear and the linearized fine-tuning optimizations are, based on the strength of the regularization. This allows us to be pragmatic about how good a model linearization is when fine-tuning large language…
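
The link between a Euclidean penalty and kernel learning can be checked numerically in the simplest case. The sketch below assumes a squared loss, a scalar-output model already linearized around the pretrained parameters, and a random matrix standing in for the Jacobian; the function name `linearized_finetune` and all sizes are hypothetical, and the paper's exact formulation may differ. It solves the penalized problem once in parameter space and once as kernel ridge regression with the empirical NTK, then compares the two updates.

```python
import numpy as np

def linearized_finetune(J, residual, lam):
    """Solve min_delta 0.5*||J @ delta - residual||^2 + 0.5*lam*||delta||^2.

    J        : (n, p) Jacobian of model outputs w.r.t. parameters at theta_0
    residual : (n,) targets minus pretrained outputs, y - f(x; theta_0)
    lam      : strength of the Euclidean penalty on delta = theta - theta_0
    """
    n, p = J.shape
    # Parameter-space (primal) solution: a p x p linear system.
    delta_primal = np.linalg.solve(J.T @ J + lam * np.eye(p), J.T @ residual)
    # Kernel-space (dual) solution: an n x n system in the empirical NTK.
    K = J @ J.T
    alpha = np.linalg.solve(K + lam * np.eye(n), residual)
    delta_dual = J.T @ alpha
    return delta_primal, delta_dual, K

# Stand-in data (assumption): 8 examples, 50 parameters, random Jacobian.
rng = np.random.default_rng(0)
J = rng.normal(size=(8, 50))
residual = rng.normal(size=8)

delta_primal, delta_dual, K = linearized_finetune(J, residual, lam=0.1)
max_gap = float(np.max(np.abs(delta_primal - delta_dual)))
print("largest difference between the two updates:", max_gap)
```

With `lam > 0` the matrix `K + lam * I` is positive definite even when `K` itself is rank-deficient, so the kernel-space system always has a unique solution, and its size depends on the number of examples rather than the number of parameters.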

Videos

Linearization Explains Fine-Tuning in Large Language Models · SlidesLive

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Speech Recognition and Synthesis