Elastic Spectral State Space Models for Budgeted Inference
Dachuan Song, Xuan Wang

TL;DR
This paper introduces Elastic Spectral State Space Models (ES-SSM) that can be trained once and efficiently truncated to different sizes for resource-constrained inference, maintaining competitive performance across diverse tasks.
Contribution
The authors propose ES-SSM, a novel spectral state space model that enables flexible, runtime model scaling without retraining, addressing resource variability in real-world applications.
Findings
ES-SSM achieves competitive performance with modern models at various scales.
The model provides smooth performance curves across different truncation levels.
ES-SSM demonstrates versatility across multiple long-sequence benchmarks.
Abstract
Foundation models are typically trained at a fixed computational capacity, while real-world applications require deployment across platforms with different resource constraints. Current approaches usually rely on training families of model variants or model distillation, which requires additional training and supports only a pre-selected set of sizes rather than fine-grained adaptation at runtime. In this paper, we propose Elastic Spectral State Space Models (ES-SSM), which require only one-time training at full capacity, but can be directly truncated into arbitrary scales for budgeted, runtime inference without retraining. Our ES-SSM builds on Hankel spectral filtering over a state space model (SSM), coupled with a lightweight input-adaptive gate trained under randomized spectral budgets. Using a shared masked normalization rule over the ordered spectral channels, we encourage…
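The truncation idea described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract is cut off before the details, so the Hankel matrix (taken from the general spectral filtering literature), the masked softmax used as the "masked normalization rule", and all function names (`hankel_filters`, `masked_gate`, `truncated_output`) are assumptions made for illustration. The gate logits are passed in as a plain vector here, whereas the paper describes an input-adaptive gate.

```python
import numpy as np

def hankel_filters(seq_len, num_filters):
    """Top eigenvectors of a Hankel matrix, ordered by decreasing eigenvalue.

    Assumption: Z[i, j] = 2 / ((i + j)^3 - (i + j)) with 1-based i, j, as used
    in the spectral filtering literature; the paper may use a different matrix.
    """
    idx = np.arange(1, seq_len + 1)
    s = idx[:, None] + idx[None, :]
    Z = 2.0 / (s**3 - s)
    eigvals, eigvecs = np.linalg.eigh(Z)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:num_filters]  # keep the largest ones first
    return eigvals[order], eigvecs[:, order]         # shapes (K,), (L, K)

def masked_gate(logits, budget):
    """Softmax over the first `budget` ordered channels, zero weight elsewhere.

    Assumption: this stands in for the shared masked normalization rule.
    Requires 1 <= budget <= len(logits).
    """
    num_channels = logits.shape[-1]
    mask = np.arange(num_channels) < budget
    masked = np.where(mask, logits, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

def spectral_features(u, filters):
    """Causal convolution of a 1-D input u (L,) with each filter, giving (L, K)."""
    length = u.shape[0]
    cols = [np.convolve(u, filters[:, k])[:length] for k in range(filters.shape[1])]
    return np.stack(cols, axis=1)

def truncated_output(u, filters, gate_logits, budget):
    """Mix the spectral channels using only the first `budget` of them."""
    feats = spectral_features(u, filters)
    weights = masked_gate(gate_logits, budget)
    return feats @ weights

# Usage: one set of filters and gate logits, evaluated at two budgets.
_, filters = hankel_filters(seq_len=16, num_filters=4)
u = np.ones(16)
logits = np.zeros(4)  # hypothetical gate logits
y_full = truncated_output(u, filters, logits, budget=4)
y_small = truncated_output(u, filters, logits, budget=2)
```

Because the weights are renormalized over the kept channels, running with `budget=2` gives the same result as a model that only ever had the first two channels. Under the same assumptions, training with randomized spectral budgets would amount to sampling `budget` anew at each step so the leading channels learn to work at every truncation level.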
Taxonomy
Topics: Machine Learning and Algorithms · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning
