Rethinking Layer Redundancy in Large Language Models: Calibration Objectives and Search for Depth Pruning

Minkyu Kim; Vincent-Daniel Yun; Youngrae Kim; Youngjin Heo; Suin Cho; Seong-hun Kim; Woosang Lim; Gaeul Kwon

arXiv:2604.24938·cs.LG·May 12, 2026


TL;DR

This paper studies how the calibration objective shapes depth pruning in large language models. It finds that the choice of objective affects which layers appear redundant more than the choice of search algorithm does.

Contribution

The paper introduces a functional perspective on layer redundancy: redundancy depends jointly on the model and the calibration objective, and the objective matters more than the search algorithm in pruning decisions.

Findings

01 Different calibration objectives lead to distinct pruning patterns.

02 Perplexity and reasoning accuracy rankings often do not align.

03 Search algorithms tend to find similar solutions under a fixed objective.
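One way to make the second and third findings measurable is with two generic diagnostics: a rank correlation between how two metrics order the same pruned models, and a set overlap between the layers two searches remove. The sketch below is illustrative only; `spearman_rho` and `jaccard` are hypothetical helper names, the summary above does not say which agreement measures the authors used, and the tie-free Spearman formula is an assumption made to keep the example short.

```python
from typing import Iterable, List, Sequence


def spearman_rho(a: Sequence[float], b: Sequence[float]) -> float:
    """Spearman rank correlation for two equal-length score lists.

    Assumes no tied values and at least two items (illustrative shortcut).
    """
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 items")

    def ranks(xs: Sequence[float]) -> List[int]:
        # Position of each item after sorting by score (0 = smallest).
        order = sorted(range(n), key=lambda i: xs[i])
        r = [0] * n
        for rank, i in enumerate(order):
            r[i] = rank
        return r

    ra, rb = ranks(a), ranks(b)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


def jaccard(s: Iterable[int], t: Iterable[int]) -> float:
    """Overlap between two sets of removed layer indices (1.0 = identical)."""
    s, t = set(s), set(t)
    union = s | t
    return len(s & t) / len(union) if union else 1.0
```

Because lower perplexity is better while higher accuracy is better, negate the perplexity scores before passing them to `spearman_rho`, so that a value near 1 means the two metrics agree on which pruned models are best. A `jaccard` value near 1 between the layer sets removed by two search algorithms would correspond to the convergence described in the third finding.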

Abstract

Depth pruning improves the inference efficiency of large language models by removing Transformer blocks. Prior work has largely treated layer redundancy as an inherent structural property of pretrained networks, emphasizing importance criteria and search algorithms for identifying removable layers. In contrast, we adopt a functional perspective, where redundancy depends jointly on the model and the calibration objective, suggesting that a universal layer ranking may not exist. Through an empirical study across three LLM families, two calibration objectives, and seven search algorithms, we find that different objectives produce qualitatively different pruning patterns, while perplexity and downstream reasoning accuracy rankings often fail to align. Under a fixed objective, however, different search algorithms tend to converge to similar pruning solutions. Overall, our results…
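The functional view can be sketched by making the calibration objective an explicit argument of the search. The example below is a minimal illustration, not the authors' method: `greedy_depth_prune` is a hypothetical greedy search (the abstract does not name the seven algorithms studied), and the two objectives are toy additive costs with made-up numbers that stand in for a perplexity-style loss and a task-style loss measured on calibration data.

```python
from typing import Callable, List, Sequence

Objective = Callable[[Sequence[int]], float]


def greedy_depth_prune(
    num_layers: int, objective: Objective, num_to_remove: int
) -> List[int]:
    """Return the layer indices kept after greedily removing layers.

    At each step, drop the layer whose removal yields the lowest
    objective value (lower is better).
    """
    kept = list(range(num_layers))
    for _ in range(num_to_remove):
        best = min(kept, key=lambda i: objective([j for j in kept if j != i]))
        kept.remove(best)
    return kept


def make_toy_objective(drop_costs: Sequence[float]) -> Objective:
    """Toy objective: total cost of all dropped layers.

    A real objective would run the pruned model on calibration data.
    """
    def objective(kept: Sequence[int]) -> float:
        kept_set = set(kept)
        return sum(c for i, c in enumerate(drop_costs) if i not in kept_set)
    return objective


# Made-up per-layer costs, chosen only so the two objectives disagree.
PPL_STYLE_COST = [5.0, 0.2, 0.1, 3.0, 0.3, 4.0]
TASK_STYLE_COST = [5.0, 2.0, 3.0, 0.1, 0.2, 4.0]

keep_ppl = greedy_depth_prune(6, make_toy_objective(PPL_STYLE_COST), 2)
keep_task = greedy_depth_prune(6, make_toy_objective(TASK_STYLE_COST), 2)
print(keep_ppl)   # [0, 3, 4, 5]
print(keep_task)  # [0, 1, 2, 5]
```

The same search routine removes layers 1 and 2 under the first toy objective and layers 3 and 4 under the second. That is the sense in which redundancy here is a property of the model and objective together rather than of the model alone.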
