ProxSparse: Regularized Learning of Semi-Structured Sparsity Masks for Pretrained LLMs
Hongyi Liu, Rajarshi Saha, Zhen Jia, Youngsuk Park, Jiaji Huang, Shoham Sabach, Yu-Xiang Wang, George Karypis

TL;DR
ProxSparse introduces a learning-based, regularized optimization framework for semi-structured pruning of large language models, enabling more effective and globally informed mask selection without additional weight updates.
Contribution
The paper presents a regularized optimization approach that turns the rigid, non-differentiable mask selection step of semi-structured pruning into a smoother procedure, and it outperforms existing heuristic-based methods.
Findings
ProxSparse consistently outperforms previously proposed semi-structured pruning methods in evaluations on 7 widely used models.
Once the mask is determined, the method involves no additional weight updates.
The mask is learned with global feedback rather than chosen by local, layer-wise heuristic rules.
Abstract
Large Language Models (LLMs) have demonstrated exceptional performance in natural language processing tasks, yet their massive size makes serving them inefficient and costly. Semi-structured pruning has emerged as an effective method for model acceleration, but existing approaches are suboptimal because they focus on local, layer-wise optimizations using heuristic rules, failing to leverage global feedback. We present ProxSparse, a learning-based framework for mask selection enabled by regularized optimization. ProxSparse transforms the rigid, non-differentiable mask selection process into a smoother optimization procedure, allowing gradual mask exploration with flexibility. ProxSparse does not involve additional weight updates once the mask is determined. Our extensive evaluations on 7 widely used models show that ProxSparse consistently outperforms previously proposed semi-structured…
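For readers unfamiliar with semi-structured sparsity, the following is a minimal sketch of the kind of local heuristic rule the abstract contrasts ProxSparse with. It is not the ProxSparse method. It assumes an N:M pattern (here 2:4, meaning at most 2 nonzero weights in every group of 4 consecutive weights), which the summary above does not state explicitly, and it selects the mask purely by weight magnitude. The function name `nm_magnitude_mask` and the example weights are hypothetical.

```python
import numpy as np


def nm_magnitude_mask(weights: np.ndarray, n: int = 2, m: int = 4) -> np.ndarray:
    """Return a boolean mask keeping the n largest-magnitude entries in each
    group of m consecutive weights along the last axis (assumed N:M pattern)."""
    rows, cols = weights.shape
    if cols % m != 0:
        raise ValueError("number of columns must be divisible by m")
    # View each row as (cols // m) groups of m weights.
    groups = np.abs(weights).reshape(rows, cols // m, m)
    # argsort is ascending, so the last n indices are the largest magnitudes.
    keep = np.argsort(groups, axis=-1)[..., -n:]
    mask = np.zeros_like(groups, dtype=bool)
    np.put_along_axis(mask, keep, True, axis=-1)
    return mask.reshape(rows, cols)


# Hypothetical weights: one row, two groups of four.
w = np.array([[0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.4, 0.0]])
mask = nm_magnitude_mask(w)
pruned = w * mask  # weights outside the mask are zeroed; kept weights are unchanged
```

Each group is decided in isolation from the rest of the network, which is the "local" behavior the abstract describes. A learned approach would instead let the choice of surviving weights in each group be driven by a training objective.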
Taxonomy
Topics: Infrastructure Maintenance and Monitoring
