Weight Spectra Induced Efficient Model Adaptation

Chongjie Si; Xuankun Yang; Muqing Liu; Yadao Wang; Xiaokang Yang; Wenbo Su; Bo Zheng; Wei Shen

arXiv:2505.23099·cs.LG·May 30, 2025

TL;DR

This paper investigates how fine-tuning large models alters their weight matrices, revealing that it mainly amplifies top singular values, and proposes a method to improve adaptation by rescaling these dominant singular directions.

Contribution

It provides a systematic analysis of weight matrix changes during fine-tuning and introduces a novel rescaling method based on spectral properties for more efficient model adaptation.

Findings

1. Fine-tuning amplifies top singular values of weight matrices.
2. Dominant singular vectors are reoriented in task-specific directions.
3. Rescaling top singular directions improves adaptation performance.
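To make the third finding concrete, here is a minimal NumPy sketch of rescaling the top singular directions of a weight matrix. It is an illustration under stated assumptions, not the paper's implementation: the function name `rescale_top_singular_directions`, the fixed scale factors, and the random matrix are all hypothetical. In an actual adaptation method the scales would presumably be trainable parameters rather than constants.

```python
import numpy as np

def rescale_top_singular_directions(W, scales):
    """Multiply the top-k singular values of W by `scales`, where k = len(scales).

    Hypothetical helper for illustration only; singular vectors are left unchanged.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    k = len(scales)
    S_new = S.copy()
    S_new[:k] = S[:k] * np.asarray(scales)
    # Rebuild the matrix as U @ diag(S_new) @ Vt
    return (U * S_new) @ Vt

# Toy stand-in for a pre-trained weight matrix (assumption: random Gaussian)
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))

# Amplify only the two dominant directions; the rest of the spectrum is untouched
W_adapted = rescale_top_singular_directions(W, scales=[1.5, 1.2])
```

Only `len(scales)` numbers change per matrix in this sketch, which suggests why adapting along dominant singular directions can be parameter-efficient.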

Abstract

Large-scale foundation models have demonstrated remarkable versatility across a wide range of downstream tasks. However, full fine-tuning of these models incurs prohibitive computational costs, motivating the development of Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA, which introduces low-rank updates to pre-trained weights. Despite their empirical success, the underlying mechanisms by which PEFT modifies model parameters remain underexplored. In this work, we present a systematic investigation into the structural changes of weight matrices during full fine-tuning. Through singular value decomposition (SVD), we reveal that fine-tuning predominantly amplifies the top singular values while leaving the remainder largely intact, suggesting that task-specific knowledge is injected into a low-dimensional subspace. Furthermore, we find that the dominant singular vectors are…
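The kind of SVD comparison the abstract describes can be sketched as follows. This is a rough illustration, not the authors' analysis code: `compare_spectra` is a hypothetical helper, and the "fine-tuned" matrix is synthetic, built by amplifying the two largest singular values of a random matrix. With real checkpoints, `W_pre` and `W_ft` would be the same layer's weights before and after fine-tuning.

```python
import numpy as np

def compare_spectra(W_pre, W_ft, k=2):
    """Compare singular spectra of two same-shaped weight matrices.

    Returns (ratios, align):
      ratios: element-wise ratio of fine-tuned to pre-trained singular values
      align:  |cosine| between matching top-k left singular vectors
              (absolute value because the sign of a singular vector is arbitrary)
    Hypothetical helper for illustration only.
    """
    U0, S0, _ = np.linalg.svd(W_pre, full_matrices=False)
    U1, S1, _ = np.linalg.svd(W_ft, full_matrices=False)
    ratios = S1 / S0
    align = np.abs(np.sum(U0[:, :k] * U1[:, :k], axis=0))
    return ratios, align

# Synthetic stand-ins (assumption): a random "pre-trained" matrix and a
# "fine-tuned" copy whose top-2 singular values are amplified by 1.5
rng = np.random.default_rng(0)
W_pre = rng.standard_normal((8, 6))
U, S, Vt = np.linalg.svd(W_pre, full_matrices=False)
S_ft = S.copy()
S_ft[:2] *= 1.5
W_ft = (U * S_ft) @ Vt

ratios, align = compare_spectra(W_pre, W_ft)
```

In this toy setup the ratios are 1.5 for the first two singular values and 1 for the rest, and the top singular vectors stay aligned because the construction only rescales them. On real fine-tuned weights, an alignment below 1 would indicate that the dominant directions have also rotated.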

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning