Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score
Jimyung Hong, Jaehyung Kim

TL;DR
DIET is a training-free, dimension-wise global pruning method for LLMs that merges task-specific importance scores into a single pruning mask, preserving more accuracy at a given sparsity level.
Contribution
DIET introduces a novel, training-free approach that merges task-specific importance scores for effective structured pruning of LLMs at various sparsity levels.
Findings
At 20% sparsity, DIET improves average accuracy by nearly 10% over previous methods.
DIET requires only 100 samples per task for importance profiling.
The method is effective across multiple benchmarks and model scales.
Abstract
Large language models (LLMs) have demonstrated remarkable capabilities, but their massive scale poses significant challenges for practical deployment. Structured pruning offers a promising solution by removing entire dimensions or layers, yet existing methods face critical trade-offs: task-agnostic approaches cannot adapt to task-specific requirements, while task-aware methods require costly training to learn task adaptability. We propose DIET (Dimension-wise global pruning of LLMs via merging Task-wise importance scores), a training-free structured pruning method that combines dimension-level granularity with task-aware selection. DIET profiles activation magnitudes across tasks using only 100 samples per task, then applies majority voting to construct a single global mask. DIET requires neither costly pre-computation nor training. Experiments on seven zero-shot benchmarks…
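The profile-then-vote procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the use of mean absolute activation as the importance score, the per-task top-k selection, and the rule of ranking dimensions by vote count to hit the target sparsity are all assumptions made for illustration.

```python
import numpy as np


def task_importance(activations):
    """Score each hidden dimension by its mean absolute activation.

    activations: array of shape (num_samples, hidden_dim), e.g. collected
    from about 100 samples of one task. The exact scoring rule is an assumption.
    """
    return np.abs(activations).mean(axis=0)


def task_keep_mask(importance, sparsity):
    """Boolean mask keeping the top (1 - sparsity) fraction of dimensions."""
    num_keep = int(round(importance.size * (1.0 - sparsity)))
    # Stable sort on negated scores: highest first, ties go to the lower index.
    keep_idx = np.argsort(-importance, kind="stable")[:num_keep]
    mask = np.zeros(importance.size, dtype=bool)
    mask[keep_idx] = True
    return mask


def merge_by_majority_vote(task_masks, sparsity):
    """Merge per-task masks into one global mask.

    Each task votes for the dimensions it wants to keep. Dimensions are then
    ranked by vote count and the top (1 - sparsity) fraction is kept, so the
    global mask meets the target sparsity exactly. This tie-handling rule is
    an assumption of this sketch.
    """
    votes = np.sum(task_masks, axis=0)  # votes per dimension
    num_keep = int(round(votes.size * (1.0 - sparsity)))
    keep_idx = np.argsort(-votes, kind="stable")[:num_keep]
    global_mask = np.zeros(votes.size, dtype=bool)
    global_mask[keep_idx] = True
    return global_mask
```

In use, one would compute `task_keep_mask(task_importance(acts), sparsity)` for each task's activations and pass the resulting list to `merge_by_majority_vote`; the returned boolean vector marks which hidden dimensions survive pruning across the whole model.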
