On Pruning State-Space LLMs

Tamer Ghattas; Michael Hassid; Roy Schwartz

arXiv:2502.18886·cs.CL·October 7, 2025

TL;DR

This paper adapts pruning methods to state-space models (SSMs) used as large language models and finds that these models tolerate some methods (e.g., WANDA) well, while other methods degrade performance quickly.

Contribution

The paper adapts several pruning methods to the structure of SSM-based LLMs and evaluates them on four models across multiple tasks, showing which methods these models are robust to.

Findings

01 WANDA pruning maintains performance well

02 Other pruning methods cause performance degradation

03 SSMs show robustness to certain pruning techniques

Abstract

Recent work proposed state-space models (SSMs) as an efficient alternative to transformer-based LLMs. Can these models be pruned to further reduce their computation costs? We adapt several pruning methods to the SSM structure, and apply them to four SSM-based LLMs across multiple tasks. We find that such models are quite robust to some pruning methods (e.g., WANDA), while using other methods leads to fast performance degradation.
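For readers unfamiliar with WANDA, the following is a minimal sketch of its scoring rule as generally described for dense linear layers: each weight is scored by its magnitude times the L2 norm of the matching input feature over a small calibration set, and the lowest-scoring weights in each output row are zeroed. This is an illustration only, not the SSM-specific adaptation evaluated in the paper; the function name `wanda_prune`, its parameters, and the toy matrices are hypothetical.

```python
import math

def wanda_prune(weights, activations, sparsity):
    """Zero the lowest-scoring fraction of weights in each output row.

    weights:     list of rows, shape (out_features, in_features)
    activations: calibration inputs, shape (n_samples, in_features)
    sparsity:    fraction of weights to remove per row, in [0, 1]
    (All names here are hypothetical and chosen for illustration.)
    """
    n_in = len(weights[0])
    # L2 norm of each input feature across the calibration samples.
    feat_norm = [math.sqrt(sum(x[j] ** 2 for x in activations))
                 for j in range(n_in)]
    n_prune = int(n_in * sparsity)
    pruned = []
    for row in weights:
        # Score = |weight| * norm of the input feature it multiplies.
        scores = [abs(w) * feat_norm[j] for j, w in enumerate(row)]
        # Indices of the n_prune smallest scores within this row.
        drop = set(sorted(range(n_in), key=lambda j: scores[j])[:n_prune])
        pruned.append([0.0 if j in drop else w for j, w in enumerate(row)])
    return pruned

# Toy layer: 2 outputs, 4 inputs; feature 1 is almost always near zero.
W = [[0.5, -2.0, 0.1, 1.0],
     [3.0, 0.2, -0.4, 0.05]]
X = [[1.0, 0.1, 4.0, 1.0],
     [1.0, 0.1, 3.0, 1.0]]
pruned = wanda_prune(W, X, sparsity=0.5)
print(pruned)
```

In this toy example the weight -2.0 has the largest magnitude in the first row, yet it is removed because the input feature it multiplies has a very small norm. Plain magnitude pruning would have kept it, which is the main difference between the two criteria.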

Taxonomy

Topics: Time Series Analysis and Forecasting · Parallel Computing and Optimization Techniques · Advanced Memory and Neural Computing