Sparsity Induction for Accurate Post-Training Pruning of Large Language Models

Minhao Jiang; Zhikai Li; Xuewen Liu; Jing Zhang; Mengjuan Chen; Qingyi Gu

arXiv:2602.21652·cs.CL·February 26, 2026

TL;DR

This paper introduces Sparsity Induction, a method that pushes a large language model toward higher sparsity before post-training pruning, so that removing weights disrupts the model less and performance recovers better.

Contribution

It proposes distribution-level and feature-level sparsity promotion techniques that improve pruning effectiveness without extra inference cost.

Findings

01. Achieves higher sparsity with better performance recovery.

02. Enhances pruning across diverse models and tasks.

03. No additional inference overhead introduced.

Abstract

Large language models have demonstrated capabilities in text generation, while their increasing parameter scales present challenges in computational and memory efficiency. Post-training sparsity (PTS), which reduces model cost by removing weights from dense networks, is an effective approach. However, native dense matrices lack high sparsity, so existing approaches that directly remove weights disrupt model states, resulting in unsatisfactory performance recovery even with post-tuning. We propose Sparsity Induction, which promotes models toward higher sparsity at both the distribution and feature levels before pruning, to push the limits of PTS. At the distribution level, we enhance distributional sparsity through mathematically equivalent scaling transformations, which are fully absorbable and incur no extra parameters or inference-time overhead. At the feature level, we introduce…
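The abstract does not spell out the scaling transformation, so the following is only a generic sketch of what "mathematically equivalent and fully absorbable" can mean for a pair of adjacent layers. It assumes a toy block `y = W2 · relu(W1 · x)` and positive per-channel scales `s`; the helper names (`absorb_scaling`, `forward`) and the randomly drawn scales are hypothetical, and how the paper actually chooses its scales to increase sparsity is not shown here.

```python
import random

def relu(v):
    return [max(0.0, a) for a in v]

def matvec(W, x):
    # Dense matrix-vector product; W is a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def forward(W1, W2, x):
    # Toy two-layer block (an assumption for illustration): y = W2 @ relu(W1 @ x)
    return matvec(W2, relu(matvec(W1, x)))

def absorb_scaling(W1, W2, s):
    # Multiply hidden channel j of W1 (row j) by s[j] and divide the matching
    # input column j of W2 by s[j]. For s[j] > 0, relu(s * z) == s * relu(z),
    # so the block's output is unchanged and no extra parameters are kept.
    if any(sj <= 0 for sj in s):
        raise ValueError("scales must be positive")
    W1_scaled = [[s[j] * w for w in row] for j, row in enumerate(W1)]
    W2_scaled = [[w / s[j] for j, w in enumerate(row)] for row in W2]
    return W1_scaled, W2_scaled

random.seed(0)
d_in, d_hidden, d_out = 4, 6, 3
W1 = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_hidden)]
W2 = [[random.gauss(0, 1) for _ in range(d_hidden)] for _ in range(d_out)]
x = [random.gauss(0, 1) for _ in range(d_in)]
# Arbitrary illustrative scales, not a sparsity-promoting choice.
s = [random.uniform(0.5, 2.0) for _ in range(d_hidden)]

W1_s, W2_s = absorb_scaling(W1, W2, s)
y_ref = forward(W1, W2, x)
y_new = forward(W1_s, W2_s, x)
max_err = max(abs(a - b) for a, b in zip(y_ref, y_new))
print(f"max output difference after rescaling: {max_err:.2e}")
```

Because the scales are folded directly into `W1` and `W2`, the rescaled block has the same shapes and the same inference cost as the original; only the weight distribution that a pruning step later sees has changed.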

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Generative Adversarial Networks and Image Synthesis