EvoP: Robust LLM Inference via Evolutionary Pruning

Shangyu Wu; Hongchao Du; Ying Xiong; Shuai Chen; Tei-Wei Kuo; Nan Guan; Chun Jason Xue

arXiv:2502.14910 · cs.CL · August 14, 2025

TL;DR

EvoP is an evolutionary pruning framework that enhances large language model efficiency and robustness by intelligently searching for optimal pruning patterns using a diverse calibration dataset.

Contribution

EvoP introduces a novel evolutionary pruning method and a cluster-based dataset sampling strategy to improve LLM pruning performance and robustness.

Findings

1. EvoP outperforms existing pruning methods in accuracy and efficiency.

2. EvoP maintains high performance across various LLMs and tasks.

3. The framework is practical and scalable for real-world deployment.

Abstract

Large Language Models (LLMs) have achieved remarkable success in natural language processing tasks, but their massive size and computational demands hinder their deployment in resource-constrained environments. Existing model pruning methods address this issue by removing redundant structures (e.g., elements, channels, layers) from the model. However, these methods rely on heuristic pruning strategies, which leads to suboptimal performance, and they ignore the characteristics of the data when pruning the model. To overcome these limitations, we propose EvoP, an evolutionary pruning framework for robust LLM inference. EvoP first uses a cluster-based calibration dataset sampling (CCDS) strategy to create a more diverse calibration dataset. EvoP then introduces an evolutionary pruning pattern searching (EPPS) method to find the optimal pruning pattern. Compared to existing model…
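
The two stages named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the clustering algorithm, the mutation operator, or the fitness function, so everything below is an assumption chosen for illustration. Here the "embeddings" are 1-D numbers clustered with a simple k-means, a pruning pattern is a tuple of keep/prune bits over layers, and `toy_loss` is a hypothetical stand-in for evaluating a pruned model on the calibration set.

```python
import random


def sample_calibration_set(embeddings, k, per_cluster, rng, iters=10):
    """Assumed CCDS stand-in: k-means on 1-D values, then sample from every cluster."""
    centers = rng.sample(embeddings, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for idx, x in enumerate(embeddings):
            nearest = min(range(k), key=lambda c: abs(x - centers[c]))
            clusters[nearest].append(idx)
        centers = [
            sum(embeddings[i] for i in members) / len(members) if members else centers[c]
            for c, members in enumerate(clusters)
        ]
    picked = []
    for members in clusters:
        # Drawing from each cluster keeps the calibration set diverse.
        picked.extend(rng.sample(members, min(per_cluster, len(members))))
    return picked


def random_pattern(num_layers, num_pruned, rng):
    """A pattern is a tuple of bits: 1 keeps a layer, 0 prunes it."""
    pruned = set(rng.sample(range(num_layers), num_pruned))
    return tuple(0 if i in pruned else 1 for i in range(num_layers))


def mutate(pattern, rng):
    """Swap one kept layer with one pruned layer so the sparsity stays fixed."""
    kept = [i for i, bit in enumerate(pattern) if bit == 1]
    pruned = [i for i, bit in enumerate(pattern) if bit == 0]
    child = list(pattern)
    child[rng.choice(kept)] = 0
    child[rng.choice(pruned)] = 1
    return tuple(child)


def evolve(fitness, num_layers, num_pruned, rng, pop_size=20, generations=30):
    """Assumed EPPS stand-in: keep the better half, refill with mutated copies."""
    population = [random_pattern(num_layers, num_pruned, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)  # lower fitness value is better
        parents = population[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng) for _ in range(pop_size - len(parents))]
        population = parents + children
    return min(population, key=fitness)


rng = random.Random(0)

# Hypothetical 1-D "embeddings" drawn around three centers.
embeddings = [rng.gauss(mu, 0.1) for mu in (0.0, 1.0, 2.0) for _ in range(20)]
calibration_ids = sample_calibration_set(embeddings, k=3, per_cluster=4, rng=rng)

# Hypothetical per-layer importance; a real fitness would score the pruned
# model on the sampled calibration data instead.
importance = [rng.random() for _ in range(12)]


def toy_loss(pattern):
    return sum(w for w, bit in zip(importance, pattern) if bit == 0)


best = evolve(toy_loss, num_layers=12, num_pruned=4, rng=rng)
print(best, round(toy_loss(best), 3))
```

The swap mutation is one simple way to keep the number of pruned layers constant across generations, so every candidate meets the same sparsity budget. Other operators, such as crossover, would fit the same loop.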
