Pruning as a Domain-specific LLM Extractor
Nan Zhang, Yanchi Liu, Xujiang Zhao, Wei Cheng, Runxue Bao, Rui Zhang, Prasenjit Mitra, Haifeng Chen

TL;DR
This paper presents D-Pruner, a novel unstructured dual-pruning method that efficiently compresses large language models for domain-specific applications by preserving both general capabilities and domain knowledge.
Contribution
It introduces a new pruning technique that identifies and retains crucial weights for both general language skills and domain-specific knowledge, improving domain adaptation of LLMs.
Findings
Effective domain-specific compression demonstrated in healthcare and legal tasks.
Preserves general language capabilities while reducing model size.
Outperforms existing pruning methods in domain adaptation scenarios.
Abstract
Large Language Models (LLMs) have exhibited remarkable proficiency across a wide array of NLP tasks. However, the escalation in model size also engenders substantial deployment costs. While a few efforts have explored model pruning techniques to reduce the size of LLMs, they mainly center on general or task-specific weights. When applied to domain-specific challenges, this leads to suboptimal performance, owing to a lack of specificity to the target domain or of generality across tasks. This work introduces an innovative unstructured dual-pruning methodology, D-Pruner, for domain-specific compression of LLMs. It extracts a compressed, domain-specific, and task-agnostic LLM by identifying the LLM weights that are pivotal both for general capabilities, such as linguistic capability and multi-task solving, and for domain-specific knowledge. More specifically, we first assess general weight importance by…
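The idea of keeping weights that matter for either general or domain-specific behavior can be illustrated with a minimal sketch. This is not D-Pruner's actual scoring procedure, which the truncated abstract does not give. The two score lists, the linear mixing coefficient `alpha`, and the helper names `dual_importance_mask` and `apply_mask` are all hypothetical, chosen only to show what an unstructured mask built from two importance signals might look like.

```python
# Illustrative sketch only: NOT the D-Pruner algorithm.
# Assumption: each weight has two precomputed importance scores
# (one "general", one "domain"); how the paper computes them is not shown here.

def dual_importance_mask(general_scores, domain_scores, sparsity, alpha=0.5):
    """Return a keep-mask (1 = keep, 0 = prune) over a flat list of weights.

    sparsity: fraction of weights to prune, in [0, 1].
    alpha: hypothetical mixing coefficient between the two scores.
    """
    if len(general_scores) != len(domain_scores):
        raise ValueError("score lists must have the same length")
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")

    # Combine the two signals into one score per weight (assumed linear mix).
    combined = [
        alpha * g + (1.0 - alpha) * d
        for g, d in zip(general_scores, domain_scores)
    ]

    # Unstructured pruning: drop the individually lowest-scoring weights,
    # with no constraint on rows, columns, or blocks.
    n_prune = int(len(combined) * sparsity)
    order = sorted(range(len(combined)), key=lambda i: combined[i])
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(combined))]


def apply_mask(weights, mask):
    """Zero out the pruned weights."""
    return [w * m for w, m in zip(weights, mask)]


# Toy values, made up for illustration.
weights = [0.8, -0.1, 0.05, -0.6, 0.3, 0.02]
general = [0.9, 0.1, 0.2, 0.1, 0.5, 0.0]
domain = [0.1, 0.0, 0.1, 0.9, 0.4, 0.1]

mask = dual_importance_mask(general, domain, sparsity=0.5)
pruned_weights = apply_mask(weights, mask)
print(mask)  # [1, 0, 0, 1, 1, 0]
```

In this toy run, the weight at index 3 has a low general score but survives because its domain score is high. That is the behavior a dual criterion is meant to provide, and a purely general criterion would miss it.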
Taxonomy
Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing
Methods: Pruning
