ROSE: Reordered SparseGPT for More Accurate One-Shot Large Language Models Pruning

Mingluo Su; Huan Wang

arXiv:2603.05878 · cs.CL · March 9, 2026

TL;DR

ROSE introduces a reordering strategy for SparseGPT that prunes weights with larger potential pruning errors earlier, improving the accuracy of one-shot pruning for large language models.

Contribution

The paper proposes ROSE, a novel reordering method for SparseGPT that improves pruning accuracy by adaptively prioritizing weights based on estimated pruning loss.

Findings

01 ROSE outperforms SparseGPT on multiple LLM benchmarks.

02 Reordering weights based on loss estimates improves pruning effectiveness.

03 Empirical results show significant accuracy gains across various models.

Abstract

Pruning is widely recognized as an effective method for reducing the parameters of large language models (LLMs), potentially leading to more efficient deployment and inference. One classic and prominent path of LLM one-shot pruning is to leverage second-order gradients (i.e., Hessian), represented by the pioneering work SparseGPT. However, the predefined left-to-right pruning order in SparseGPT leads to suboptimal performance when the weights exhibit columnar patterns. This paper studies the effect of pruning order under the SparseGPT framework. The analyses lead us to propose ROSE, a reordered SparseGPT method that prioritizes weights with larger potential pruning errors to be pruned earlier. ROSE first performs pre-pruning to identify candidate weights for removal, and estimates both column and block pruning loss. Subsequently, two-level reordering is performed: columns within each…
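The column-level part of the idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes the per-weight pruning error is the OBS/SparseGPT saliency w² / [H⁻¹]ⱼⱼ, and it assumes pre-pruning is a single global threshold on that saliency. The function name `column_order_by_estimated_loss` and its parameters are hypothetical. Block-level loss and the second reordering level are left out because the abstract is truncated before it describes them.

```python
import numpy as np


def column_order_by_estimated_loss(W, h_inv_diag, sparsity=0.5):
    """Order columns so those with larger estimated pruning loss come first.

    W          : (rows, cols) weight matrix
    h_inv_diag : (cols,) diagonal of the inverse Hessian (assumed given)
    sparsity   : fraction of weights marked for removal in pre-pruning

    Hypothetical helper for illustration only.
    """
    W = np.asarray(W, dtype=float)
    h_inv_diag = np.asarray(h_inv_diag, dtype=float)

    # OBS-style saliency: error incurred by removing each weight alone.
    saliency = W ** 2 / h_inv_diag[None, :]

    # Pre-pruning (assumption): mark the k lowest-saliency weights
    # across the whole matrix as removal candidates.
    k = int(round(sparsity * W.size))
    if k == 0:
        mask = np.zeros(W.shape, dtype=bool)
    else:
        thresh = np.partition(saliency.ravel(), k - 1)[k - 1]
        mask = saliency <= thresh

    # Column loss: summed saliency of the candidates in each column.
    col_loss = (saliency * mask).sum(axis=0)

    # Larger estimated loss first; stable sort keeps ties in original order.
    order = np.argsort(-col_loss, kind="stable")
    return order, col_loss
```

A caller would permute the columns of `W` (and the matching rows and columns of the Hessian) by `order` before running the usual left-to-right pass, then undo the permutation afterwards. That wiring is an assumption about how such an ordering would be applied.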

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications