TL;DR
This paper introduces SEFT, a novel fine-tuning method for sparse LLMs that dynamically evolves their sparse topology during training, maintaining sparsity and improving task performance efficiently.
Contribution
SEFT is the first method to adapt sparse LLMs during fine-tuning by evolving their sparse connectivity, enhancing performance while preserving sparsity.
Findings
SEFT outperforms existing baselines on various LLMs and benchmarks.
SEFT maintains desired sparsity levels throughout fine-tuning.
SEFT improves task-specific adaptation with better memory and time efficiency.
Abstract
Large language models (LLMs) have achieved remarkable success across various tasks but face deployment challenges due to their massive computational demands. Post-training pruning methods such as SparseGPT and Wanda can effectively reduce model size, but they struggle to maintain model performance at high sparsity levels, limiting their utility for downstream tasks. Existing fine-tuning methods, such as full fine-tuning and LoRA, fail to preserve sparsity because they require updating the whole dense matrices, which makes them poorly suited to sparse LLMs. In this paper, we propose Sparsity Evolution Fine-Tuning (SEFT), a novel method designed specifically for sparse LLMs. SEFT dynamically evolves the sparse topology of pruned models during fine-tuning while preserving the overall sparsity throughout the process. The strengths of SEFT lie in its ability to perform task-specific adaptation through a…
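To make the idea of evolving a sparse topology at fixed sparsity concrete, here is a minimal sketch of a generic drop-and-grow update. It is not SEFT's actual procedure, which the truncated abstract does not specify. It assumes a magnitude-based drop criterion and a gradient-based grow criterion, as used in common dynamic sparse training schemes. The names `sparsity` and `drop_and_grow` and the toy values are hypothetical.

```python
def sparsity(mask):
    """Fraction of entries that are pruned (mask value 0)."""
    return 1.0 - sum(mask) / len(mask)


def drop_and_grow(weights, mask, grads, k):
    """One topology update on a flattened weight tensor (illustrative only).

    Assumed criteria, not taken from the paper: drop the k active weights
    with the smallest magnitude, then activate the k pruned positions with
    the largest gradient magnitude. The number of active weights is unchanged.
    """
    active = [i for i, m in enumerate(mask) if m == 1]
    inactive = [i for i, m in enumerate(mask) if m == 0]
    k = min(k, len(active), len(inactive))

    # Candidates are chosen before any change, so a dropped position
    # cannot be regrown in the same step.
    drop = sorted(active, key=lambda i: abs(weights[i]))[:k]
    grow = sorted(inactive, key=lambda i: abs(grads[i]), reverse=True)[:k]

    new_weights, new_mask = list(weights), list(mask)
    for i in drop:
        new_mask[i] = 0
        new_weights[i] = 0.0
    for i in grow:
        new_mask[i] = 1
        new_weights[i] = 0.0  # regrown weights start at zero and are trained afterwards
    return new_weights, new_mask


# Toy example: 6 weights at 50% sparsity.
weights = [0.9, 0.0, -0.05, 0.0, 0.4, 0.0]
mask = [1, 0, 1, 0, 1, 0]
grads = [0.1, 0.8, 0.2, 0.05, 0.3, 0.4]

new_weights, new_mask = drop_and_grow(weights, mask, grads, k=1)
print(new_mask)            # [1, 1, 0, 0, 1, 0]
print(sparsity(new_mask))  # 0.5
```

The connectivity changes (position 2 is dropped, position 1 is grown) while the sparsity level stays at 50%. A dense update to every entry, as in full fine-tuning, offers no such guarantee.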
