AIRE-Prune: Asymptotic Impulse-Response Energy for State Pruning in State Space Models
Apurba Prasad Padhy, Fernando Camacho, Saibal Mukhopadhyay

TL;DR
AIRE-Prune is a novel structured pruning method for state space models that minimizes long-term output energy distortion by assigning each state an asymptotic impulse-response energy score, enabling effective reduction of model size with minimal accuracy loss.
Contribution
It introduces a new pruning technique based on asymptotic impulse-response energy, extending modal truncation to deep state space models for improved efficiency.
Findings
Achieves an average of 60.8% pruning across benchmarks.
Maintains 99.71% of original accuracy without retraining.
Reduces compute significantly while preserving performance.
Abstract
State space models (SSMs) often sacrifice capacity, search space, or stability to offset the memory and compute costs of large state dimensions. We introduce a structured post-training pruning method for SSMs -- AIRE-Prune (Asymptotic Impulse-Response Energy for State PRUN(E)) -- that reduces each layer's state dimension by directly minimizing long-run output-energy distortion. AIRE-Prune assigns every state a closed-form asymptotic impulse-response energy-based score, i.e., the total impulse-response energy it contributes over an infinite horizon (time), and normalizes these scores layer-wise to enable global cross-layer comparison and selection. This extends modal truncation from single systems to deep stacks and aligns pruning with asymptotic response energy rather than worst-case gain. Across diverse sequence benchmarks, AIRE-Prune reveals substantial redundancy in SISO and MIMO…
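The per-state score described in the abstract can be derived in a few lines for a diagonal discrete-time layer. The derivation below is a sketch under stated assumptions: the symbols (Λ, B, C, h_i, E_i) and the discrete-time, strictly stable setting (|λ_i| < 1) are chosen here for illustration, and the paper's own notation and discretization may differ. The result agrees with the closed form quoted in the peer reviews.

```latex
\begin{align*}
x_{k+1} &= \Lambda x_k + B u_k, \qquad y_k = C x_k, \qquad
  \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n), \quad |\lambda_i| < 1 \\
h_i[k] &= C_{:,i}\, \lambda_i^{k}\, B_{i,:}, \qquad k \ge 0
  \quad \text{(contribution of state $i$ to the impulse response)} \\
E_i &= \sum_{k=0}^{\infty} \bigl\| h_i[k] \bigr\|_F^2
     = \|C_{:,i}\|_2^2 \, \|B_{i,:}\|_2^2 \sum_{k=0}^{\infty} |\lambda_i|^{2k}
     = \frac{\|C_{:,i}\|_2^2 \, \|B_{i,:}\|_2^2}{1 - |\lambda_i|^2}
\end{align*}
```

The middle step uses the fact that the Frobenius norm of a rank-one matrix factors into the product of the two vector norms, and the last step is a geometric series that converges only when |λ_i| < 1.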
Peer Reviews
Decision: ICLR 2026 Poster
- Good empirical results: 60.8% average pruning with only 0.29% accuracy drop without retraining on LRA
- Closed-form solution for importance scores seems efficient
- Energy-based metric is well-motivated from control theory perspective and has clear mathematical grounding
- Works for both SISO and MIMO SSMs

- Minor: missing reference details in line 117
- Limited to diagonal/diagonalizable SSMs and doesn't extend to input-selective models like Mamba
- Only evaluated on Long Range Arena without speech and language benchmarks
- No retraining experiments to show potential further improvements
- Comparison mainly against only one recent baseline (LAST)
- No discussion of computational overhead of computing energy scores
- Limited analysis of why certain tasks (ListOps) are more sensitive to pruning
1. Novel Pruning Criterion: The paper proposes a new and theoretically motivated pruning technique for SSMs.
2. Strong Empirical Results: The method achieves state-of-the-art performance on the tested benchmark (S5 model), effectively demonstrating its capability to outperform prior pruning approaches in that specific setting.
1. **Clarity of Methodology (Sections 3.2 & 4):** The paper's core methodology is difficult to understand as written. Section 3.2 lacks the clarity and citations needed to follow it fully. Furthermore, the theoretical part (Section 4) is hard to understand.
2. **Limited Evaluation and Generalizability:** The empirical validation is a significant weakness. The approach is only tested on a single model (S5) and a single benchmark. This narrow scope makes it impossible to assess the generalizability of the tech
Strong Theoretical Foundation: The paper elegantly bridges classical control theory (modal truncation) with modern deep learning (layer-adaptive pruning). The energy-based criterion has clear physical interpretation: states with low Eᵢ = ‖C_{:,i}‖₂² ‖B_{i,:}‖₂² / (1 − |λᵢ|²) contribute minimal long-run output energy.
Practical Algorithm with Closed-Form Solutions: Unlike iterative methods, AIRE-Prune requires only: (1) compute Eᵢ per mode, (2) sort and compute prefix sums, (3) apply global threshold. No m
Severely Limited Experimental Scope:
- Only S5 models evaluated: No experiments on Mamba, Mamba2, or hybrid architectures which dominate current practice
- Only LRA benchmark: Missing speech (Speech Commands), language modeling (WikiText), or modern long-context tasks
- Input-selective SSMs (Mamba) have input-dependent B, C; the energy formulation assumes these are fixed. How does AIRE extend to this case?
Incomplete Comparison with LAST:
- Table 1 shows different pruning ratios per task, maki
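The three steps one reviewer lists for the algorithm (compute Eᵢ per mode, sort and compute prefix sums, apply a global threshold) can be sketched in NumPy. This is a minimal illustration, not the authors' implementation: the function names, the `(lam, B, C)` layer layout, and in particular the normalization are assumptions. The source only states that scores are normalized layer-wise; here each state is assigned the fraction of its layer's total energy that would be lost by removing it together with every lower-energy state in that layer.

```python
import numpy as np


def aire_scores(lam, B, C):
    """Per-state energy E_i = ||C[:, i]||^2 * ||B[i, :]||^2 / (1 - |lam_i|^2).

    lam: (n,) eigenvalues of a diagonal discrete-time layer, B: (n, m), C: (p, n).
    """
    lam = np.asarray(lam)
    if np.any(np.abs(lam) >= 1.0):
        raise ValueError("every mode must be strictly stable (|lambda| < 1)")
    b_energy = np.sum(np.abs(B) ** 2, axis=1)  # squared row norms of B
    c_energy = np.sum(np.abs(C) ** 2, axis=0)  # squared column norms of C
    return c_energy * b_energy / (1.0 - np.abs(lam) ** 2)


def lost_energy_fraction(energy):
    """ASSUMED normalization: share of the layer's energy lost if this state
    and all lower-energy states in the same layer are removed."""
    order = np.argsort(energy)           # ascending: weakest states first
    prefix = np.cumsum(energy[order])    # prefix sums over the sorted energies
    frac = np.empty_like(prefix)
    frac[order] = prefix / prefix[-1]    # scatter back to original state order
    return frac


def global_prune_masks(layers, threshold):
    """Apply one threshold to every layer; True marks a state that is kept.

    layers: iterable of (lam, B, C) tuples (hypothetical layout).
    For threshold < 1 the highest-energy state of each layer is always kept.
    """
    return [lost_energy_fraction(aire_scores(lam, B, C)) > threshold
            for lam, B, C in layers]


if __name__ == "__main__":
    lam = np.array([0.9, 0.1, 0.5])
    B = np.ones((3, 1))
    C = np.ones((1, 3))
    print(aire_scores(lam, B, C))
    print(global_prune_masks([(lam, B, C)], threshold=0.2))
```

Because the normalized value is a fraction of each layer's own energy, a single threshold is comparable across layers of different scale, which is the role the abstract gives to layer-wise normalization. Slow-decaying modes (|λ| close to 1) receive large scores and are pruned last.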
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Model Reduction and Neural Networks · Formal Methods in Verification
