Pruning General Large Language Models into Customized Expert Models

Yirao Zhao; Guizhen Chen; Kenji Kawaguchi; Lidong Bing; Wenxuan Zhang

arXiv:2506.02561 · cs.CL · June 4, 2025

Open Access

TL;DR

This paper introduces Cus-Prun, a novel pruning method that efficiently creates compact expert language models tailored to specific domains, tasks, or languages without post-training, outperforming existing approaches.

Contribution

The paper presents Cus-Prun, a new pruning technique that directly produces lightweight expert models along language, domain, and task dimensions without additional training.

Findings

01. Cus-Prun outperforms existing pruning methods in preserving model capabilities.

02. Cus-Prun effectively creates expert models tailored to specific scenarios.

03. The method works across various model sizes and families.

Abstract

Large language models (LLMs) have revolutionized natural language processing, yet their substantial model sizes often require substantial computational resources. To preserve computing resources and accelerate inference speed, it is crucial to prune redundant parameters, especially for experienced users who often need compact expert models tailored to specific downstream scenarios. However, most existing pruning methods focus on preserving the model's general capabilities, often requiring extensive post-training or suffering from degraded performance due to coarse-grained pruning. In this work, we design a Custom Pruning method (Cus-Prun) to prune a large general model into a smaller lightweight expert model, which is positioned along the "language", "domain" and "task" dimensions. By identifying and pruning irrelevant neurons of each dimension,…
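
The abstract is cut off before it states how irrelevant neurons are identified, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a single ReLU feed-forward block, scores each hidden neuron by its mean absolute activation on calibration inputs for each dimension, and prunes only the neurons that score low on all three dimensions. The function names, the `keep_ratio` parameter, and the scoring rule are all hypothetical.

```python
import numpy as np

def irrelevant_neurons(activations, keep_ratio):
    """Flag the lowest-scoring neurons for one dimension.

    activations: (n_samples, n_neurons) hidden activations on calibration data.
    Importance is the mean absolute activation (an assumed criterion).
    """
    importance = np.abs(activations).mean(axis=0)
    n_prune = int(round((1.0 - keep_ratio) * importance.size))
    mask = np.zeros(importance.size, dtype=bool)
    if n_prune > 0:
        mask[np.argsort(importance, kind="stable")[:n_prune]] = True
    return mask

def custom_prune(w_in, w_out, calib, keep_ratio=0.5):
    """Prune hidden neurons of one feed-forward block.

    w_in:  (d_model, n_neurons) input projection.
    w_out: (n_neurons, d_model) output projection.
    calib: dict mapping a dimension name ("language", "domain", "task")
           to calibration inputs of shape (n_samples, d_model).
    """
    masks = []
    for inputs in calib.values():
        hidden = np.maximum(inputs @ w_in, 0.0)  # ReLU activations
        masks.append(irrelevant_neurons(hidden, keep_ratio))
    # Assumption: prune only neurons irrelevant to every dimension.
    prune_mask = np.logical_and.reduce(masks)
    keep = ~prune_mask
    return w_in[:, keep], w_out[keep, :], prune_mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_in, w_out = rng.normal(size=(4, 6)), rng.normal(size=(6, 4))
    calib = {k: rng.normal(size=(8, 4)) for k in ("language", "domain", "task")}
    new_in, new_out, pruned = custom_prune(w_in, w_out, calib)
    print(new_in.shape, new_out.shape, int(pruned.sum()))
```

Because the pruned columns of `w_in` and rows of `w_out` are removed together, the block's input and output widths are unchanged and no retraining step is needed to keep the shapes consistent. Whether accuracy is preserved depends on the importance criterion, which this sketch does not take from the paper.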

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods