TRIM: Achieving Extreme Sparsity with Targeted Row-wise Iterative Metric-driven Pruning

Florentin Beck; William Rudman; Carsten Eickhoff

arXiv:2505.16743·cs.CL·October 14, 2025

TL;DR

TRIM is an iterative pruning method that assigns a different sparsity ratio to each output dimension (row) of a layer, guided by quality metrics, which improves the performance and stability of large language models at high sparsity.

Contribution

The paper presents TRIM, a pruning approach that sets sparsity per output dimension (row) rather than uniformly within each layer, using quality metrics to guide the allocation. It can be combined with existing layer-wise pruning strategies.

Findings

01. Achieves state-of-the-art perplexity reduction at high sparsity levels.

02. Improves stability and quality retention across diverse LLMs.

03. Demonstrates significant performance gains at 80% sparsity.

Abstract

Large Language Models (LLMs) present significant computational and memory challenges due to their extensive size, making pruning essential for their efficient deployment. Existing one-shot pruning methods often apply uniform sparsity constraints across layers or within each layer, resulting in suboptimal performance, especially at high sparsity ratios. This work introduces TRIM (Targeted Row-wise Iterative Metric-driven pruning), a novel approach that applies varying sparsity ratios to individual output dimensions (rows) within each layer. TRIM employs an iterative adjustment process guided by quality metrics to optimize dimension-wise sparsity allocation, focusing on reducing variance in quality retention across outputs to preserve critical information. TRIM can be seamlessly integrated with existing layer-wise pruning strategies. Our evaluations on perplexity and zero-shot tasks…
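The row-wise allocation idea described in the abstract can be illustrated with a small, dependency-free Python sketch. It is not the authors' implementation. Several pieces are assumptions made for illustration only: magnitude pruning stands in for the underlying pruning criterion, cosine similarity between dense and pruned row outputs stands in for the quality metric, and the update rule (shift sparsity from low-quality rows to high-quality rows while keeping the mean fixed) is a hypothetical choice. All function names and parameters (`allocate_row_sparsity`, `lr`, `lo`, `hi`) are likewise hypothetical.

```python
import math


def prune_row(row, sparsity):
    """Zero the smallest-magnitude weights of one row (stand-in pruning criterion)."""
    k = int(round(sparsity * len(row)))  # number of weights to zero out
    drop = set(sorted(range(len(row)), key=lambda j: abs(row[j]))[:k])
    return [0.0 if j in drop else w for j, w in enumerate(row)]


def row_output(row, inputs):
    """Value of one output dimension for each calibration input."""
    return [sum(w * x for w, x in zip(row, xs)) for xs in inputs]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(u * v for u, v in zip(a, b)) / (na * nb)


def row_qualities(weights, inputs, sparsities):
    """Assumed quality metric: similarity of dense vs. pruned output, per row."""
    return [
        cosine(row_output(r, inputs), row_output(prune_row(r, s), inputs))
        for r, s in zip(weights, sparsities)
    ]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def allocate_row_sparsity(weights, inputs, target, steps=10, lr=0.2, lo=0.0, hi=0.95):
    """Search for per-row sparsity ratios whose mean stays at `target`.

    Assumes lo <= target <= hi. Returns the allocation with the lowest
    variance in per-row quality seen during the search (uniform included).
    """
    sparsities = [target] * len(weights)
    best = list(sparsities)
    best_var = variance(row_qualities(weights, inputs, sparsities))
    for _ in range(steps):
        q = row_qualities(weights, inputs, sparsities)
        mean_q = sum(q) / len(q)
        # Rows that retain quality well absorb more sparsity; weak rows get relief.
        deltas = [lr * (qi - mean_q) for qi in q]
        # Shrink the step so every ratio stays in [lo, hi]. The deltas sum to
        # zero, so the mean sparsity is unchanged.
        scale = 1.0
        for s, d in zip(sparsities, deltas):
            if d > 0:
                scale = min(scale, (hi - s) / d)
            elif d < 0:
                scale = min(scale, (lo - s) / d)
        sparsities = [s + scale * d for s, d in zip(sparsities, deltas)]
        var = variance(row_qualities(weights, inputs, sparsities))
        if var < best_var:
            best, best_var = list(sparsities), var
    return best
```

Calling `allocate_row_sparsity(W, X, target=0.8)` on a weight matrix `W` (a list of rows) and calibration inputs `X` returns one sparsity ratio per row. Because the uniform allocation is the starting candidate, the result never has higher quality variance than uniform pruning under the assumed metric. For the method's actual metrics and update rule, refer to the paper and the official repository.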

Code & Models

Repositories

flobk/trim (PyTorch, official)

Taxonomy

Methods: Pruning