Kernelized Sparse Fine-Tuning with Bi-level Parameter Competition for Vision Models

Shufan Shen; Junshu Sun; Shuhui Wang; Qingming Huang

arXiv:2510.24037·cs.CV·October 29, 2025

TL;DR

This paper introduces SNELLA, a novel one-stage sparse fine-tuning method for vision models that improves task relevance detection, increases adaptation capacity through kernelized low-rank updates, and reduces memory usage while achieving state-of-the-art results.

Contribution

SNELLA combines kernelized low-rank updates with adaptive bi-level sparsity for efficient, high-performance vision model fine-tuning in a single stage.
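As a rough illustration of what a kernelized low-rank update with sparsity could look like, the sketch below builds an update matrix entry by entry from two small factor matrices and a kernel function, keeps only the largest-magnitude entries, and adds the result to a frozen weight matrix. Everything in it is an assumption made for illustration: the RBF kernel, the names `rbf_kernel`, `kernelized_update`, `keep_top_k`, and `apply_update`, and the simple global top-k rule are not taken from the paper and stand in for whatever kernel and bi-level sparsity mechanism SNELLA actually uses.

```python
import math

def rbf_kernel(u, v, sigma=1.0):
    """Hypothetical kernel choice (RBF); the paper's kernel may differ."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernelized_update(A, B, kernel):
    """Build delta[i][j] = kernel(A[i], B[j]).

    With a plain dot-product kernel this reduces to the usual
    low-rank product A @ B^T.
    """
    return [[kernel(a, b) for b in B] for a in A]

def keep_top_k(delta, k):
    """Zero out all but the k largest-magnitude entries (ties may keep more)."""
    magnitudes = sorted((abs(v) for row in delta for v in row), reverse=True)
    threshold = magnitudes[k - 1]
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in delta]

def apply_update(W, delta):
    """Add the sparse update to the frozen pre-trained weights."""
    return [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(W, delta)]

# Toy example: a 2x3 weight matrix with rank-2 factors (illustrative values).
W = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]]
A = [[1.0, 0.0], [0.0, 1.0]]               # one row per output dimension
B = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # one row per input dimension

dense_delta = kernelized_update(A, B, rbf_kernel)
sparse_delta = keep_top_k(dense_delta, k=2)
merged = apply_update(W, sparse_delta)
```

In this toy setup only the small factors `A` and `B` would be trainable, so the optimizer would track far fewer values than the full weight matrix.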

Findings

01. SNELLA outperforms previous methods on classification, segmentation, and generation tasks.

02. SNELLA achieves 1.8% higher Top-1 accuracy on the FGVC benchmark.

03. SNELLA reduces memory usage by 31.1%-39.9% across various models.

Abstract

Parameter-efficient fine-tuning (PEFT) aims to adapt pre-trained vision models to downstream tasks. Among PEFT paradigms, sparse tuning achieves remarkable performance by adjusting only the weights most relevant to downstream tasks, rather than densely tuning the entire weight matrix. Current methods follow a two-stage paradigm. In the first stage, they locate task-relevant weights using gradient information, which overlooks the parameter adjustments made during fine-tuning and limits performance. In the second stage, they update only the located weights by applying a sparse mask to the gradient of the weight matrix, which results in high memory usage because all weight matrices are stored in the optimizer. In this paper, we propose a one-stage method named SNELLA to overcome these limitations. For memory usage, SNELLA selectively updates the weight matrix by adding it to another sparse matrix that is merged by…
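The two-stage paradigm criticized in the abstract can be sketched in a few lines. The example below is a minimal illustration, not the implementation of any specific prior method: it assumes a flattened weight vector, plain SGD, and a top-k gradient-magnitude selection rule, and the names `locate_mask` and `masked_sgd_step` are hypothetical.

```python
def locate_mask(grads, k):
    """Stage 1: mark the k weights with the largest gradient magnitude."""
    ranked = sorted(range(len(grads)), key=lambda i: abs(grads[i]), reverse=True)
    chosen = set(ranked[:k])
    return [1.0 if i in chosen else 0.0 for i in range(len(grads))]

def masked_sgd_step(weights, grads, mask, lr):
    """Stage 2: apply the gradient only where the mask is 1."""
    return [w - lr * m * g for w, g, m in zip(weights, grads, mask)]

# Toy values chosen for illustration only.
weights = [0.5, -1.0, 0.25, 2.0]
grads = [0.1, -0.8, 0.05, 0.4]

mask = locate_mask(grads, k=2)            # fixed once, before fine-tuning
updated = masked_sgd_step(weights, grads, mask, lr=0.5)
```

Two properties of this sketch mirror the limitations the abstract describes. The mask is fixed from gradients taken before training, so it cannot react to how the parameters change during fine-tuning. The optimizer also still receives the full weight vector and a full-size gradient even though most entries are masked to zero, which is where the memory cost comes from.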
