An Efficient Sparse Fine-Tuning with Low Quantization Error via Neural Network Pruning

Cen-Jhih Li; Aditya Bhaskara

arXiv:2502.11439·cs.CL·July 28, 2025

TL;DR

This paper introduces a sparse fine-tuning (SpFT) framework based on neural network pruning that improves SpFT's memory efficiency by 20-50% while maintaining accuracy on common language tasks.

Contribution

It proposes a novel sparse fine-tuning method leveraging neural network pruning to identify important neurons, improving efficiency without sacrificing performance.

Findings

01 SpFT memory efficiency improved by 20-50%.

02 Maintains accuracy comparable to state-of-the-art methods.

03 Applicable to common language tasks.

Abstract

Fine-tuning is an important step in adapting foundation models such as large language models to downstream tasks. To make this step more accessible to users with limited computational budgets, it is crucial to develop fine-tuning methods that are memory and computationally efficient. Sparse Fine-tuning (SpFT) and Low-rank adaptation (LoRA) are two frameworks that have emerged for addressing this problem and have been adopted widely in practice. In this work, we develop a new SpFT framework, based on ideas from neural network pruning. At a high level, we first identify "important" neurons/nodes using feature importance metrics from network pruning (specifically, we use the structural pruning method), and then perform fine-tuning by restricting to weights involving these neurons. Experiments on common language tasks show our method improves SpFT's memory efficiency by 20-50% while…
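
The two-step idea in the abstract (score neurons, then fine-tune only the weights attached to the selected ones) can be illustrated with a minimal sketch. This is not the authors' implementation. The importance metric used here (the L2 norm of each neuron's weight row), the choice to update only rows rather than rows and columns, and the names `neuron_importance`, `select_neurons`, and `sparse_update` are all assumptions made for illustration.

```python
import math

def neuron_importance(weight):
    """Score each output neuron by the L2 norm of its weight row.

    Assumed stand-in metric; the paper's importance metric may differ.
    """
    return [math.sqrt(sum(w * w for w in row)) for row in weight]

def select_neurons(weight, k):
    """Return the indices of the k highest-scoring neurons, in ascending order."""
    scores = neuron_importance(weight)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def sparse_update(weight, grad, selected, lr):
    """Apply one gradient step to the selected rows only; other rows stay frozen."""
    chosen = set(selected)
    return [
        [w - lr * g for w, g in zip(w_row, g_row)] if i in chosen else list(w_row)
        for i, (w_row, g_row) in enumerate(zip(weight, grad))
    ]

# Toy layer with 4 output neurons and 3 inputs (hypothetical values).
weight = [
    [0.1, -0.2, 0.05],
    [1.5, -0.7, 0.9],
    [0.0, 0.3, -0.1],
    [-1.1, 0.8, 0.4],
]
grad = [[1.0, 1.0, 1.0] for _ in weight]  # placeholder gradient

selected = select_neurons(weight, k=2)
updated = sparse_update(weight, grad, selected, lr=0.1)
trainable_fraction = len(selected) / len(weight)
```

In this toy setting only the gradients and optimizer state for the selected rows would need to be kept, which is the general source of memory savings in sparse fine-tuning; the 20-50% figure reported above comes from the paper's experiments, not from this sketch.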

Taxonomy

Topics: Digital Filter Design and Implementation · VLSI and FPGA Design Techniques · Low-power high-performance VLSI design

Methods: Shrink and Fine-Tune · Pruning