Efficient Post-Training Pruning of Large Language Models with Statistical Correction

Peiqi Yu; Jinhao Wang; Xinyi Sui; Nam Ling; Wei Wang; Wei Jiang

arXiv:2602.07375·cs.CL·February 10, 2026

Open Access

TL;DR

This paper introduces a lightweight post-training pruning method for large language models that uses first-order statistics of weights and activations to improve pruning quality without retraining, narrowing the gap between efficient heuristic pruning and higher-fidelity reconstruction-based pruning.

Contribution

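The authors propose a pruning framework that calibrates magnitude-based importance scores with channel-wise first-order statistics and then applies an analytic energy compensation after pruning. Neither step requires retraining, gradients, or second-order information.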

Findings

01. Improves pruning performance across multiple LLMs and tasks.
02. Maintains computational efficiency comparable to heuristic methods.
03. Effectively reduces model size while preserving accuracy.

Abstract

Post-training pruning is an effective approach for reducing the size and inference cost of large language models (LLMs), but existing methods often face a trade-off between pruning quality and computational efficiency. Heuristic pruning methods are efficient but sensitive to activation outliers, while reconstruction-based approaches improve fidelity at the cost of heavy computation. In this work, we propose a lightweight post-training pruning framework based on first-order statistical properties of model weights and activations. During pruning, channel-wise statistics are used to calibrate magnitude-based importance scores, reducing bias from activation-dominated channels. After pruning, we apply an analytic energy compensation to correct distributional distortions caused by weight removal. Both steps operate without retraining, gradients, or second-order information. Experiments across…
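
The two steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's formulation: the abstract does not give the exact statistics or the compensation formula, so everything below is an assumption chosen for illustration. The function names (`calibrated_scores`, `prune_rows`, `energy_compensate`), the square-root damping of activation statistics, the per-row sparsity pattern, and the definition of "energy" as the expected squared output under independent input channels are all hypothetical.

```python
import numpy as np

def calibrated_scores(W, X, eps=1e-8):
    """Magnitude scores calibrated by a channel-wise activation statistic.

    W: (out_features, in_features) weight matrix.
    X: (n_samples, in_features) calibration activations.
    Assumption: the statistic is the mean absolute activation per input
    channel, damped with a square root so outlier channels dominate less.
    """
    act = np.abs(X).mean(axis=0)                    # first-order statistic, shape (in,)
    act_cal = np.sqrt(act / (act.mean() + eps))     # hypothetical damping step
    return np.abs(W) * act_cal                      # broadcasts across output rows

def prune_rows(scores, sparsity):
    """Boolean keep-mask that drops the lowest-scoring weights in each row."""
    n_in = scores.shape[1]
    k = int(n_in * sparsity)                        # weights removed per row
    mask = np.ones(scores.shape, dtype=bool)
    if k > 0:
        # argpartition places the k smallest scores of each row first
        drop = np.argpartition(scores, k - 1, axis=1)[:, :k]
        np.put_along_axis(mask, drop, False, axis=1)
    return mask

def energy_compensate(W, mask, X, eps=1e-8):
    """Rescale each pruned row so its expected output energy matches the dense row.

    Assumption: energy of row i is sum_j W_ij^2 * E[x_j^2], which treats
    input channels as independent. No gradients or Hessians are used.
    """
    act_energy = (X ** 2).mean(axis=0)              # E[x_j^2], shape (in,)
    W_pruned = W * mask
    e_dense = (W ** 2 * act_energy).sum(axis=1)
    e_kept = (W_pruned ** 2 * act_energy).sum(axis=1)
    scale = np.sqrt(e_dense / (e_kept + eps))
    return W_pruned * scale[:, None]

# Usage on random data (illustrative only)
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8))
X = rng.normal(size=(32, 8))
mask = prune_rows(calibrated_scores(W, X), sparsity=0.5)
W_sparse = energy_compensate(W, mask, X)
```

Both steps need only one pass over the calibration activations and closed-form arithmetic, which is the property the abstract emphasizes; the specific statistics used in the paper may differ from the ones assumed here.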

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Natural Language Processing Techniques