Breaking Expert Knowledge Limits: Self-Pruning for Large Language Models

Haidong Kang; Lihong Lin; Enneng Yang; Hongning Dai; Hao Wang

arXiv:2511.15390 · cs.CV · November 20, 2025

TL;DR

This paper introduces AutoPrune, a novel method enabling large language models to automatically design their own pruning algorithms, overcoming expert knowledge limits and addressing outlier value issues for improved performance.

Contribution

AutoPrune is the first approach allowing LLMs to self-prune without expert-designed algorithms, utilizing GCoT for prompt optimization and SDSA for adaptive sparsity, significantly enhancing pruning performance.

Findings

01. AutoPrune outperforms state-of-the-art pruning methods.

02. GCoT improves reasoning in pruning algorithm design.

03. SDSA mitigates performance loss at high pruning ratios.

Abstract

Large language models (LLMs) have achieved remarkable performance on a wide range of tasks, but their massive size hinders real-world deployment. Existing pruning methods tailored for LLMs (e.g., Wanda) rely heavily on manually designed pruning algorithms, which incur huge labor costs and require expert knowledge. Furthermore, we are the first to identify the serious outlier value issue behind the dramatic performance degradation that uniform sparsity causes under high pruning ratios, which raises an additional question: how to design adaptive pruning sparsity suited to LLMs. Can LLMs prune by themselves? In this work, we answer in the affirmative by proposing a novel pruning method called AutoPrune, which first overcomes expert knowledge limits by leveraging LLMs to design optimal pruning algorithms for themselves automatically…
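For readers unfamiliar with the baseline named in the abstract, here is a minimal sketch of a hand-designed, Wanda-style pruning rule applied with uniform sparsity. It is not AutoPrune and not the authors' code. Each weight is scored as its magnitude times the L2 norm of the matching input activation, and the same fraction of weights is removed from every output row. The function names, the toy weights `W`, and the calibration activations `X` are all hypothetical and chosen only for illustration.

```python
import math

def wanda_scores(weights, activations):
    """Score each weight as |W[i][j]| * ||X[:, j]||_2 (Wanda-style metric)."""
    n_in = len(weights[0])
    # L2 norm of each input feature over the calibration samples
    norms = [math.sqrt(sum(sample[j] ** 2 for sample in activations))
             for j in range(n_in)]
    return [[abs(w) * norms[j] for j, w in enumerate(row)] for row in weights]

def prune_uniform(weights, activations, sparsity):
    """Zero the lowest-scoring `sparsity` fraction of weights in every row."""
    scores = wanda_scores(weights, activations)
    pruned = []
    for w_row, s_row in zip(weights, scores):
        k = int(len(w_row) * sparsity)  # same pruning budget for every row
        drop = set(sorted(range(len(w_row)), key=lambda j: s_row[j])[:k])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(w_row)])
    return pruned

# Toy layer (made-up numbers): 2 output rows, 4 input features.
W = [[0.5, -0.1, 0.3, 0.1],
     [0.2, 0.4, -0.6, 0.1]]
# Two calibration samples; input feature 3 carries much larger activations.
X = [[1.0, 0.1, 2.0, 10.0],
     [1.0, 0.2, 2.0, 10.0]]

W_pruned = prune_uniform(W, X, sparsity=0.5)
print(W_pruned)  # [[0.0, 0.0, 0.3, 0.1], [0.0, 0.0, -0.6, 0.1]]
```

In this toy case the small weights on feature 3 survive because their activations are large, whereas pure magnitude pruning would have dropped them. The fixed `sparsity` argument is the uniform choice the abstract questions: every row loses the same fraction of weights regardless of how its scores are distributed.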

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques