Pretraining Induces a Reusable Spectral Basis for Downstream Task Adaptation

Junjie Yu; Yue Wang; Zihan Deng; Yan Zhu; Wenxiao Ma; Quanying Liu

arXiv:2605.07302·cs.LG·May 11, 2026

TL;DR

This paper shows that pretraining establishes a stable spectral basis shared across tasks, and that freezing these spectral directions while tuning only the spectral coefficients can adapt models to new tasks with few trainable parameters.

Contribution

The study uncovers the spectral stability of pretrained models across tasks and introduces a parameter-efficient adaptation method based on spectral coefficients.

Findings

01

Leading singular vectors are highly stable during finetuning.

02

Pretraining on larger datasets enhances spectral stability under shifts.

03

Freezing the singular vectors and tuning only their coefficients performs well on downstream tasks.
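
The third finding can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a single linear layer, a synthetic downstream task that is exactly realizable in the pretrained basis, and plain gradient descent. All names (`w_pre`, `s_target`, `lr`) are hypothetical. The singular vectors of the pretrained weight stay frozen, and only the vector of coefficients `s` is trained.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, n = 16, 12, 256

# Stand-in for a pretrained weight matrix; its singular vectors are frozen.
w_pre = rng.standard_normal((d_out, d_in))
u, s_pre, vt = np.linalg.svd(w_pre, full_matrices=False)

# Assumed downstream task: same spectral basis, different coefficients.
s_target = s_pre * rng.uniform(0.5, 1.5, size=s_pre.shape)
x = rng.standard_normal((n, d_in))
y = x @ (u @ np.diag(s_target) @ vt).T

# The only trainable parameters: min(d_out, d_in) coefficients.
s = s_pre.copy()
z = x @ vt.T  # inputs projected onto the frozen right singular vectors
lr = 0.1
for _ in range(500):
    resid = (z * s) @ u.T - y                # prediction error, shape (n, d_out)
    grad = np.mean(z * (resid @ u), axis=0)  # gradient of the mean squared error w.r.t. s
    s -= lr * grad

w_adapted = u @ np.diag(s) @ vt
```

In this toy setting only 12 coefficients are trained, against 192 entries in the full matrix. Whether such a restricted update suffices on a real task depends on how well the pretrained basis covers that task, which is the empirical question the paper addresses.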

Abstract

Finetuning pretrained models occurs in a low-dimensional subspace of the full parameter space. Prior work has focused on characterizing this optimization subspace, but largely ignored the complementary question: why do certain directions remain unexplored during finetuning? Are these stable directions irrelevant to downstream tasks, or do they already encode task-relevant structure that requires no further adjustment? Answering this question is central to understanding how pretrained knowledge transfers. Through systematic spectral analysis across vision and language models, we show that the leading singular vectors of pretrained weight matrices remain highly stable under finetuning and are shared across unrelated downstream tasks, revealing that pretraining establishes a reusable spectral coordinate system. Models pretrained on larger datasets exhibit greater spectral stability under…
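
One way to make "stability of the leading singular vectors" concrete is to compare the top-k singular subspaces of a weight matrix before and after finetuning. The sketch below is an illustration only: the overlap metric (mean squared cosine of the principal angles), the function name `leading_subspace_overlap`, and the synthetic matrices are assumptions, not the paper's measurement protocol.

```python
import numpy as np

def leading_subspace_overlap(w_a, w_b, k):
    """Mean squared cosine of the principal angles between the top-k
    left singular subspaces of w_a and w_b (1.0 means identical)."""
    u_a, _, _ = np.linalg.svd(w_a, full_matrices=False)
    u_b, _, _ = np.linalg.svd(w_b, full_matrices=False)
    # Singular values of U_a^T U_b are the cosines of the principal angles.
    cosines = np.linalg.svd(u_a[:, :k].T @ u_b[:, :k], compute_uv=False)
    return float(np.mean(cosines ** 2))

rng = np.random.default_rng(0)

# Toy "pretrained" weight: a strong rank-4 component plus weak noise.
u, _ = np.linalg.qr(rng.standard_normal((64, 4)))
v, _ = np.linalg.qr(rng.standard_normal((32, 4)))
w_pre = u @ np.diag([10.0, 8.0, 6.0, 4.0]) @ v.T + 0.01 * rng.standard_normal((64, 32))

# Toy "finetuned" weight: a small perturbation of the pretrained one.
w_ft = w_pre + 0.01 * rng.standard_normal((64, 32))

# Unrelated random matrix as a baseline.
w_rand = rng.standard_normal((64, 32))

overlap_ft = leading_subspace_overlap(w_pre, w_ft, k=4)
overlap_rand = leading_subspace_overlap(w_pre, w_rand, k=4)
```

With these synthetic matrices, `overlap_ft` is close to 1 while `overlap_rand` is small. On real checkpoints the same function could be applied layer by layer to a pretrained weight and its finetuned counterpart.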
