Two-Stage Regularization-Based Structured Pruning for LLMs

Mingkuan Feng; Jinyang Wu; Siyuan Liu; Shuai Zhang; Hongjian Fang; Ruihan Jin; Feihu Che; Pengpeng Shao; Zhengqi Wen; Jianhua Tao

arXiv:2505.18232·cs.LG·April 21, 2026

TL;DR

The paper introduces TRSP, a two-stage regularization-based structured pruning method for large language models that preserves knowledge and performance without extensive retraining.

Contribution

It proposes a novel two-stage regularization approach for structured pruning, improving knowledge retention and model performance in LLMs.

Findings

01. TRSP outperforms existing layer-wise structured pruning methods.

02. It achieves significant end-to-end acceleration.

03. No retraining is required after pruning.

Abstract

The deployment of large language models (LLMs) is largely hindered by their large number of parameters. Structured pruning has emerged as a promising solution. Prior structured pruning methods directly remove unimportant parameters based on certain metrics, which often causes knowledge loss and necessitates extensive retraining. To overcome this, we introduce a novel pruning method, TRSP: Two-Stage Regularization-Based Structured Pruning for LLMs. Specifically, we multiply the output of each transformer layer by an initial learnable weight and iteratively learn these weights by adding their $\ell_{1}$-norm as a regularization term to the loss function, serving as the first-stage regularization. Subsequently, we apply additional regularization to the difference between the output and input of layers with smaller weights, encouraging the shift of knowledge to the preserved layers. This…
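The two regularization terms described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the coefficients `lam1` and `lam2`, the rule of picking the `num_prune` smallest weights, and the use of a Euclidean norm for the output-input difference are all assumptions made for illustration. Hidden states are represented as plain lists of floats.

```python
# Illustrative sketch only; every name and coefficient here is hypothetical.
import math


def l1_penalty(layer_weights, lam1):
    """First stage: lam1 times the l1-norm of the per-layer learnable weights."""
    return lam1 * sum(abs(w) for w in layer_weights)


def select_layers_to_prune(layer_weights, num_prune):
    """Return the indices of the num_prune layers with the smallest |weight|."""
    order = sorted(range(len(layer_weights)), key=lambda i: abs(layer_weights[i]))
    return sorted(order[:num_prune])


def residual_penalty(layer_inputs, layer_outputs, prune_idx, lam2):
    """Second stage: penalize the output-input difference of the selected layers.

    The Euclidean norm is an assumption; the abstract does not name the norm.
    """
    total = 0.0
    for i in prune_idx:
        diff = [o - x for o, x in zip(layer_outputs[i], layer_inputs[i])]
        total += math.sqrt(sum(d * d for d in diff))
    return lam2 * total


# Toy example with four layers and two-dimensional hidden states.
weights = [0.9, 0.1, 0.5, 0.05]
inputs = [[1.0, 2.0] for _ in weights]
outputs = [[2.0, 2.0], [4.0, 6.0], [1.0, 3.0], [1.0, 2.0]]

stage1 = l1_penalty(weights, lam1=0.01)
prune_idx = select_layers_to_prune(weights, num_prune=2)
stage2 = residual_penalty(inputs, outputs, prune_idx, lam2=0.1)
```

In a training loop, `stage1` would be added to the task loss while the layer weights are learned, and `stage2` would be added afterwards for the layers selected for removal. Driving a layer's output toward its input makes the layer close to an identity map, which is why removing it afterwards should cost little.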

Code & Models

Repositories

fmk345/TRSP (GitHub)