EBFT: Effective and Block-Wise Fine-Tuning for Sparse LLMs
Song Guo, Fan Wu, Lei Zhang, Xiawu Zheng, Shengchuan Zhang, Fei Chao, Yiyu Shi, Rongrong Ji

TL;DR
EBFT is a novel, efficient block-wise fine-tuning framework for sparse LLMs that minimizes reconstruction error, achieving superior performance with reduced resource consumption and fast training times.
Contribution
The paper introduces EBFT, a new method for fine-tuning sparse LLMs that optimizes block-wise reconstruction error, outperforming existing methods in accuracy and efficiency.
Findings
Achieves perplexity of 16.88 on Wikitext2 with LlamaV1-7B at 70% sparsity.
Outperforms state-of-the-art methods like DSnoT and LoRA in perplexity.
Fine-tuning takes approximately 30 minutes on a single 16GB GPU.
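For readers unfamiliar with the metric, perplexity is conventionally the exponential of the mean per-token negative log-likelihood, so lower is better. The snippet below is a minimal sketch of that definition, not the paper's evaluation code; the function name and the list-of-log-probabilities input are assumptions made for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Hypothetical helper: computes exp(mean negative log-likelihood).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to every observed token is as
# uncertain as a uniform choice among 4 options, so its perplexity is
# approximately 4.0.
uniform_ppl = perplexity([math.log(0.25)] * 4)
```

Under this reading, a perplexity of 16.88 corresponds to the model being, on average, about as uncertain as a uniform choice among roughly 17 tokens.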
Abstract
Existing methods for fine-tuning sparse LLMs often suffer from resource-intensive requirements and high retraining costs. Additionally, many fine-tuning methods rely on approximations or heuristic optimization strategies, which may lead to suboptimal solutions. To address these issues, we propose an efficient and fast framework for fine-tuning sparse LLMs based on minimizing reconstruction error. Our approach samples a small calibration dataset and uses backpropagation to iteratively optimize the reconstruction error one block at a time, aiming for optimal solutions. Extensive experiments on various benchmarks consistently demonstrate the superiority of our method over other baselines. For instance, on the Wikitext2 dataset with LlamaV1-7B at 70% sparsity, our proposed EBFT achieves a perplexity of 16.88, surpassing the state-of-the-art DSnoT with…
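The idea in the abstract, tuning a pruned block so that its output on calibration data matches the dense block's output, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a single linear layer stands in for a transformer block, a magnitude-pruning mask that stays fixed during tuning, random data in place of a sampled calibration set, and a hand-derived gradient in place of framework backpropagation. All function names are hypothetical.

```python
import numpy as np


def magnitude_mask(weight, sparsity):
    """Boolean mask that keeps the largest-magnitude weights.

    Assumption: simple magnitude pruning is used only to obtain a sparse
    starting point for the illustration.
    """
    k = int(round(weight.size * sparsity))  # number of weights to prune
    if k == 0:
        return np.ones(weight.shape, dtype=bool)
    threshold = np.partition(np.abs(weight).ravel(), k - 1)[k - 1]
    return np.abs(weight) > threshold


def reconstruction_error(x, w_sparse, y_dense):
    """Mean over calibration samples of the squared output difference."""
    diff = x @ w_sparse.T - y_dense
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def finetune_block(x, w_dense, mask, lr=0.1, steps=200):
    """Gradient descent on the reconstruction error of one (toy) block."""
    y_dense = x @ w_dense.T          # target: output of the dense block
    w = w_dense * mask               # pruned starting point
    n = x.shape[0]
    for _ in range(steps):
        residual = x @ w.T - y_dense
        grad = 2.0 * residual.T @ x / n   # gradient of the error w.r.t. w
        w -= lr * grad * mask             # pruned positions stay at zero
    return w


# Toy calibration data and a toy "block" (assumed shapes).
rng = np.random.default_rng(0)
x = rng.standard_normal((128, 32))        # 128 calibration samples
w_dense = rng.standard_normal((16, 32))   # dense block weights
mask = magnitude_mask(w_dense, sparsity=0.7)

y_dense = x @ w_dense.T
err_before = reconstruction_error(x, w_dense * mask, y_dense)
w_tuned = finetune_block(x, w_dense, mask)
err_after = reconstruction_error(x, w_tuned, y_dense)
print(f"reconstruction error: {err_before:.3f} -> {err_after:.3f}")
```

In a multi-block model, the same loop would presumably be repeated for each block in turn, which is why only one block's weights and activations need to be held for optimization at a time.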
Taxonomy
Topics: Network Packet Processing and Optimization · Natural Language Processing Techniques
