Kernelized Sparse Fine-Tuning with Bi-level Parameter Competition for Vision Models
Shufan Shen, Junshu Sun, Shuhui Wang, Qingming Huang

TL;DR
This paper introduces SNELLA, a novel one-stage sparse fine-tuning method for vision models that improves task relevance detection, increases adaptation capacity through kernelized low-rank updates, and reduces memory usage while achieving state-of-the-art results.
Contribution
SNELLA combines kernelized low-rank updates with adaptive bi-level sparsity for efficient, high-performance vision model fine-tuning in a single stage.
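To make "kernelized low-rank updates" with sparsity more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a Gaussian (RBF) kernel as the nonlinear function and uses plain magnitude-based top-k selection as a stand-in for the paper's bi-level sparsity allocation, which this summary does not describe in detail. All function names here are hypothetical.

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """Gaussian (RBF) kernel between two vectors. An assumed, illustrative choice."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def linear_kernel(u, v):
    """Plain dot product; with this kernel the update is an ordinary low-rank product."""
    return sum(a * b for a, b in zip(u, v))

def kernelized_delta(A, B, kernel=rbf_kernel):
    """Build delta[i][j] = kernel(A[i], B[j]) from two low-rank factors.

    A holds d_out rows of length r, B holds d_in rows of length r.
    """
    return [[kernel(a_row, b_row) for b_row in B] for a_row in A]

def top_k_mask(delta, k):
    """Keep the k largest-magnitude entries and zero the rest.

    Stand-in for an importance-based sparsity rule; ties at the threshold
    may keep more than k entries.
    """
    flat = sorted((abs(v) for row in delta for v in row), reverse=True)
    threshold = flat[k - 1]
    return [[v if abs(v) >= threshold else 0.0 for v in row] for row in delta]

def apply_update(W, sparse_delta):
    """Add the sparse update to the frozen pre-trained weights W."""
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, sparse_delta)]
```

With `linear_kernel` the delta reduces to the product of the two factors, so its rank is bounded by `r`; a nonlinear kernel removes that bound, which is one plausible reading of the "increased adaptation capacity" claim.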
Findings
SNELLA outperforms previous methods on classification, segmentation, and generation tasks.
Achieves 1.8% higher Top-1 accuracy on the FGVC benchmark.
Reduces memory usage by 31.1%-39.9% across various models.
Abstract
Parameter-efficient fine-tuning (PEFT) aims to adapt pre-trained vision models to downstream tasks. Among PEFT paradigms, sparse tuning achieves remarkable performance by adjusting only the weights most relevant to downstream tasks, rather than densely tuning the entire weight matrix. Current methods follow a two-stage paradigm. First, they locate task-relevant weights using gradient information, which overlooks the parameter adjustments made during fine-tuning and limits performance. Second, they update only the located weights by applying a sparse mask to the gradient of the weight matrix, which results in high memory usage because the optimizer must store all weight matrices. In this paper, we propose a one-stage method named SNELLA to overcome the above limitations. For memory usage, SNELLA selectively updates the weight matrix by adding it to another sparse matrix that is merged by…
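The memory argument above can be illustrated with a rough count of optimizer state. This is a back-of-the-envelope sketch under stated assumptions: an Adam-style optimizer that keeps two moment estimates per trainable parameter, and a low-rank parameterization with two factors of rank `rank` (the rank value is illustrative, not taken from the paper). It counts optimizer state only, ignores activations and the weights themselves, and does not attempt to reproduce the 31.1%-39.9% figure reported in the findings.

```python
def adam_state_entries(num_trainable):
    """Adam-style optimizers keep two moment estimates per trainable parameter."""
    return 2 * num_trainable

def masked_dense_state(d_out, d_in):
    """Gradient masking: the full weight matrix is registered with the optimizer,
    so its state covers every entry even though most gradients are zeroed."""
    return adam_state_entries(d_out * d_in)

def low_rank_state(d_out, d_in, rank):
    """Low-rank factors: only the two factor matrices (d_out x rank and
    d_in x rank) are trainable, so the state scales with rank * (d_out + d_in)."""
    return adam_state_entries(rank * (d_out + d_in))

if __name__ == "__main__":
    d_out, d_in, rank = 768, 768, 8  # hypothetical layer size and rank
    print("masked dense:", masked_dense_state(d_out, d_in))
    print("low-rank    :", low_rank_state(d_out, d_in, rank))
```

For the hypothetical 768 x 768 layer, the masked dense approach tracks 1,179,648 state entries while the rank-8 factors track 24,576, which shows why moving the trainable parameters out of the full matrix can cut optimizer memory.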
