GradMAP: Faster Layer Pruning with Gradient Metric and Projection Compensation

Hao Liu, Guangyan Li, Wensheng Zhang, and Yongqiang Tang

arXiv:2602.14649 · cs.CL · February 17, 2026

Open Access

TL;DR

GradMAP is a layer pruning method for large language models that ranks layers with a gradient-magnitude metric and applies projection compensation after pruning, maintaining model performance while making the pruning process about 4x faster on average.

Contribution

The paper presents a novel layer importance metric based on gradient magnitudes and a projection compensation technique, enabling faster and more effective layer pruning.

Findings

1. Achieves an average 4x speedup in the pruning process.

2. Outperforms previous methods in maintaining model performance.

3. Reduces performance degradation through projection compensation.

Abstract

Large Language Models (LLMs) exhibit strong reasoning abilities, but their high computational costs limit their practical deployment. Recent studies reveal significant redundancy in LLM layers, making layer pruning an active research topic. Layer pruning research primarily focuses on two aspects: measuring layer importance and recovering performance after pruning. Unfortunately, existing works fail to simultaneously maintain pruning performance and efficiency. In this study, we propose GradMAP, a faster layer pruning method with Gradient Metric And Projection compensation, which consists of two stages. In the first stage, we introduce a novel metric based on gradient magnitudes, enabling a global assessment of layer importance. Notably, it requires only a single backward propagation step per pruning decision, substantially enhancing pruning…
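The first stage described in the abstract, scoring every layer from one backward pass, can be illustrated with a minimal sketch. The abstract shown here does not give the exact metric, so everything below is an assumption made for illustration: the toy residual stack, the squared-error loss, the choice of the L1 norm of each layer's weight gradient as the score, and the function names `forward`, `layer_scores`, and `least_important_layer`. It is not the authors' implementation.

```python
import math

def forward(x, weights):
    """Run a toy residual stack: h <- h + w * tanh(h), elementwise per layer."""
    activations = [list(x)]
    for w in weights:
        h = activations[-1]
        activations.append([hi + wi * math.tanh(hi) for hi, wi in zip(h, w)])
    return activations

def layer_scores(x, target, weights):
    """One forward and one backward pass yield a gradient-magnitude score per layer."""
    acts = forward(x, weights)
    # Gradient of 0.5 * ||h_L - target||^2 with respect to the final activation.
    g = [o - t for o, t in zip(acts[-1], target)]
    scores = [0.0] * len(weights)
    for l in reversed(range(len(weights))):
        h, w = acts[l], weights[l]
        t = [math.tanh(hi) for hi in h]
        # Assumed metric: L1 norm of the loss gradient w.r.t. this layer's weights.
        scores[l] = sum(abs(gi * ti) for gi, ti in zip(g, t))
        # Propagate the gradient to the layer input through the residual branch.
        g = [gi * (1.0 + wi * (1.0 - ti * ti)) for gi, wi, ti in zip(g, w, t)]
    return scores

def least_important_layer(scores):
    """Return the index of the layer with the smallest score (the pruning candidate)."""
    return min(range(len(scores)), key=scores.__getitem__)

# Hypothetical toy inputs: three layers acting on a two-dimensional hidden state.
weights = [[0.5, -0.3], [0.0, 0.0], [0.2, 0.4]]
scores = layer_scores([1.0, -2.0], [0.0, 0.0], weights)
candidate = least_important_layer(scores)
```

Because the scores for all layers come out of the same backward pass, choosing the next layer to prune costs one forward and one backward evaluation rather than one evaluation per candidate layer, which is the efficiency argument the abstract makes.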

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques