GPrune-LLM: Generalization-Aware Structured Pruning for Large Language Models
Xiaoyun Liu, Divya Saxena, Jiannong Cao, Yuqing Zhao, Yiying Dong, Penghui Ruan

TL;DR
GPrune-LLM is a structured pruning method for large language models that improves cross-distribution generalization by accounting for how sensitive each neuron is to the input distribution and by adaptively selecting importance metrics.
Contribution
The paper proposes a generalization-aware pruning framework that partitions neurons, evaluates the reliability of importance metrics, and adaptively learns sparsity, addressing limitations of existing importance estimation methods.
Findings
Improves cross-task generalization of pruned LLMs.
Reduces dependence on importance metric choice.
Achieves better performance at high sparsity levels.
Abstract
Structured pruning is widely used to compress large language models (LLMs), yet its effectiveness depends heavily on neuron importance estimation. Most existing methods estimate neuron importance from activation statistics on a single calibration dataset, which introduces calibration bias and degrades downstream cross-task generalization. We observe that neurons exhibit heterogeneous distribution sensitivity, with distribution-robust neurons maintaining consistent rankings across datasets and distribution-sensitive neurons showing high cross-dataset ranking variance. Based on this, we identify two structural limitations in existing methods. First, ranking all neurons within a shared space causes distribution-sensitive neurons that strongly activate on calibration inputs to dominate, crowding out distribution-robust neurons critical for out-of-distribution tasks. Second, applying…
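The distinction between distribution-robust and distribution-sensitive neurons can be illustrated with a small sketch. It ranks neurons by an importance score on each calibration dataset, then measures how much each neuron's rank varies across datasets. The toy scores, the function names, and the median-variance threshold are assumptions made for illustration only; the abstract does not specify how GPrune-LLM scores or partitions neurons.

```python
from statistics import median, pvariance


def rank_neurons(scores):
    """Return each neuron's rank for one dataset (0 = highest score)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, neuron in enumerate(order):
        ranks[neuron] = rank
    return ranks


def partition_by_rank_variance(scores_per_dataset, threshold=None):
    """Split neuron indices into (robust, sensitive) by cross-dataset rank variance.

    The median-variance default threshold is an illustrative assumption,
    not a rule taken from the paper.
    """
    ranks_per_dataset = [rank_neurons(s) for s in scores_per_dataset]
    n = len(scores_per_dataset[0])
    variances = [pvariance([r[i] for r in ranks_per_dataset]) for i in range(n)]
    if threshold is None:
        threshold = median(variances)
    robust = [i for i in range(n) if variances[i] <= threshold]
    sensitive = [i for i in range(n) if variances[i] > threshold]
    return robust, sensitive, variances


# Hypothetical importance scores: 3 calibration datasets x 4 neurons.
scores = [
    [0.9, 0.1, 0.5, 0.3],
    [0.8, 0.6, 0.5, 0.1],
    [0.9, 0.2, 0.4, 0.7],
]
robust, sensitive, variances = partition_by_rank_variance(scores)
print("robust:", robust)
print("sensitive:", sensitive)
```

In this toy data, neuron 0 holds the top rank on every dataset and so has zero rank variance, while neuron 1 moves between rank 1 and rank 3. Ranking on any single dataset would hide that difference, which is the calibration bias the abstract describes.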
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
