How Many Parameters Does Your Task Really Need? Task Specific Pruning with LLM-Sieve

Waleed Reda; Abhinav Jangda; Krishna Chintalapudi

arXiv:2505.18350·cs.LG·October 7, 2025

TL;DR

LLM-Sieve is a novel pruning framework that reduces large language models to minimal parameter sets needed for specific tasks, maintaining performance while revealing knowledge bottlenecks.

Contribution

LLM-Sieve introduces output-aligned non-orthogonal projections and adaptive, per-matrix pruning driven by a Genetic Algorithm, outperforming prior methods in model compression and interpretability.

Findings

01 Removes 20-75% of weights with only 1-5% accuracy loss

02 Reveals concentration of critical knowledge in bottleneck matrices

03 Compatible with LoRA fine-tuning and quantization

Abstract

As Large Language Models (LLMs) are increasingly deployed for narrow tasks in resource-constrained settings, a central question arises: how much of an LLM is truly necessary for a given task? We present LLM-Sieve, a framework that prunes LLMs down to the minimal parameter subset needed to preserve task performance. Our approach introduces two innovations: (i) output-aligned non-orthogonal projections, which yield more faithful low-rank approximations than traditional PCA/SVD by aligning directly with layer outputs; and (ii) adaptive pruning via a Genetic Algorithm, which automatically discovers matrix-specific pruning levels and exposes the uneven distribution of task-relevant knowledge. Across models from 3.8B to 70B parameters, LLM-Sieve removes 20-75% of weights with only 1-5% accuracy loss, substantially ahead of prior pruning methods. Beyond efficiency, our framework reveals…
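To make the contrast in (i) concrete, here is a minimal NumPy sketch of why a low-rank approximation fitted to layer outputs can be more faithful than one fitted to the weights alone. It is not the paper's method: the abstract does not specify how the non-orthogonal projections are computed, so this sketch swaps in a simpler, well-known stand-in (projecting the weights onto the top right singular vectors of the outputs, which is an orthogonal projection). The function names, matrix sizes, and synthetic "task" inputs are all assumptions made for illustration.

```python
import numpy as np

def weight_svd_lowrank(W, r):
    """Rank-r truncation of W using the weights only (classic SVD)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

def output_aligned_lowrank(W, X, r):
    """Rank-r W' minimizing ||X W - X W'||_F on calibration inputs X.

    Stand-in technique (assumption, not the paper's projection): project W
    onto the top-r right singular vectors of the outputs Y = X W.
    """
    Y = X @ W
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    V_r = Vt[:r].T             # shape (d_out, r)
    return (W @ V_r) @ V_r.T   # rank at most r

def output_error(W, W_approx, X):
    """Relative Frobenius error measured on the layer outputs."""
    return np.linalg.norm(X @ W - X @ W_approx) / np.linalg.norm(X @ W)

# Hypothetical setup: a random dense layer and task inputs that occupy
# a small subspace of the input space, plus a little noise.
rng = np.random.default_rng(0)
d_in, d_out, n, r = 64, 48, 512, 8
W = rng.standard_normal((d_in, d_out))
basis = rng.standard_normal((6, d_in))
X = rng.standard_normal((n, 6)) @ basis + 0.05 * rng.standard_normal((n, d_in))

err_svd = output_error(W, weight_svd_lowrank(W, r), X)
err_out = output_error(W, output_aligned_lowrank(W, X, r), X)
print(f"weight-only SVD, output error: {err_svd:.4f}")
print(f"output-aligned,  output error: {err_out:.4f}")
```

At equal rank, the output-aligned fit can never have a larger output error than the weight-only truncation on the same calibration inputs, because it is the optimal rank-r solution for that objective. How large the gap is depends on how narrowly the task inputs are distributed, which is the intuition behind task-specific pruning.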

Taxonomy

Topics: Natural Language Processing Techniques · Intelligent Tutoring Systems and Adaptive Learning · AI-based Problem Solving and Planning

Methods: Pruning