Q-S5: Towards Quantized State Space Models
Steven Abreu, Jens E. Pedersen, Kade M. Heckel, Alessandro Pierro

TL;DR
This paper explores the effects of quantization on the S5 state space model, demonstrating that fully quantized models maintain high accuracy and providing insights for deploying efficient, hardware-optimized sequence models on resource-constrained platforms.
Contribution
The paper systematically evaluates quantization techniques on S5 models, showing how different components tolerate low-bit precision and guiding future hardware-efficient model development.
Findings
Fully quantized S5 models lose less than 1% test accuracy on sMNIST and most of the LRA
On most tasks, recurrent weights require at least 8-bit precision to avoid significant performance degradation
Post-training quantization (PTQ) is effective mainly for language-based tasks, while quantization-aware training (QAT) is needed for others
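To make the bit-width findings above concrete, here is a minimal sketch of symmetric uniform "fake" quantization, the basic operation behind PTQ-style weight rounding. It is an illustration under stated assumptions, not the paper's implementation: the function names, the per-tensor scale, and the example weights are all hypothetical.

```python
def quantize_symmetric(values, bits):
    """Round floats onto a signed, symmetric uniform grid and map them back.

    Assumption: one scale per tensor, chosen from the largest magnitude.
    """
    if bits < 2:
        raise ValueError("bits must be >= 2 for a signed grid")
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values)               # nothing to scale
    scale = max_abs / qmax
    out = []
    for v in values:
        q = round(v / scale)              # integer code
        q = max(-qmax, min(qmax, q))      # clamp to the representable range
        out.append(q * scale)             # dequantize back to float
    return out


def max_abs_error(values, bits):
    """Largest rounding error introduced at the given bit width."""
    quantized = quantize_symmetric(values, bits)
    return max(abs(a - b) for a, b in zip(values, quantized))


# Hypothetical weights, used only to compare bit widths.
weights = [0.91, -0.37, 0.05, 0.64, -0.88]
for bits in (8, 4, 2):
    print(bits, max_abs_error(weights, bits))
```

The rounding error is bounded by half the grid step, so each bit removed roughly doubles the worst-case error. QAT would apply the same rounding inside the training loop so the model can adapt to it, which this sketch does not cover.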
Abstract
In the quest for next-generation sequence modeling architectures, State Space Models (SSMs) have emerged as a potent alternative to transformers, particularly for their computational efficiency and suitability for dynamical systems. This paper investigates the effect of quantization on the S5 model to understand its impact on model performance and to facilitate its deployment to edge and resource-constrained platforms. Using quantization-aware training (QAT) and post-training quantization (PTQ), we systematically evaluate the quantization sensitivity of SSMs across different tasks like dynamical systems modeling, Sequential MNIST (sMNIST) and most of the Long Range Arena (LRA). We present fully quantized S5 models whose test accuracy drops less than 1% on sMNIST and most of the LRA. We find that performance on most tasks degrades significantly for recurrent weights below 8-bit…
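One way to build intuition for why recurrent weights are sensitive to low precision is a toy diagonal linear recurrence, x_k = λ·x_{k-1} + b·u_k, in which the rounding error on λ is compounded at every step. The sketch below is a hedged illustration only: the eigenvalue, the fixed quantization range of [-1, 1], and all function names are assumptions made for the example, and it does not reproduce the paper's experiments.

```python
def quantize_scalar(x, bits, max_abs=1.0):
    """Symmetric uniform quantization of one float (range is an assumption)."""
    qmax = 2 ** (bits - 1) - 1
    scale = max_abs / qmax
    q = max(-qmax, min(qmax, round(x / scale)))
    return q * scale


def quantize_complex(z, bits):
    """Quantize real and imaginary parts independently."""
    return complex(quantize_scalar(z.real, bits), quantize_scalar(z.imag, bits))


def run_recurrence(lam, b, inputs):
    """Iterate x_k = lam * x_{k-1} + b * u_k for a single diagonal state."""
    x = 0j
    states = []
    for u in inputs:
        x = lam * x + b * u
        states.append(x)
    return states


def trajectory_error(lam, bits, inputs, b=1.0):
    """Largest gap between full-precision and quantized-weight trajectories."""
    ref = run_recurrence(lam, b, inputs)
    quant = run_recurrence(quantize_complex(lam, bits), b, inputs)
    return max(abs(r - q) for r, q in zip(ref, quant))


# Hypothetical stable eigenvalue (magnitude below 1) and an impulse input.
lam = complex(0.98, 0.15)
inputs = [1.0] + [0.0] * 199
for bits in (8, 4):
    print(bits, abs(quantize_complex(lam, bits)), trajectory_error(lam, bits, inputs))
```

In this toy setting the 4-bit grid rounds the real part of λ up to 1.0, which pushes the quantized eigenvalue's magnitude above 1, so the state grows instead of decaying. The 8-bit version stays inside the unit circle and tracks the reference far more closely.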
Taxonomy
Topics: Error Correcting Code Techniques · Simulation Techniques and Applications · Neural Networks and Applications
