Late Breaking Results: Quamba-SE: Soft-edge Quantizer for Activations in State Space Models
Yizhi Chen, Ahmed Hemani

TL;DR
Quamba-SE introduces a novel soft-edge quantizer for state space model activations that adaptively preserves outlier information, leading to improved zero-shot benchmark performance over previous quantization methods.
Contribution
It presents Quamba-SE, a new adaptive quantization method for SSM activations that outperforms existing techniques by preserving outlier information with multiple scales.
Findings
Achieves up to +2.68% higher accuracy than Quamba on individual zero-shot benchmarks.
Improves average accuracy across the 6 benchmarks by up to +0.83% on Mamba-130M.
Preserves outlier information with a low-precision scale instead of hard clipping.
Abstract
We propose Quamba-SE, a soft-edge quantizer for State Space Model (SSM) activation quantization. Unlike existing methods, using standard INT8 operation, Quamba-SE employs three adaptive scales: high-precision for small values, standard scale for normal values, and low-precision for outliers. This preserves outlier information instead of hard clipping, while maintaining precision for other values. We evaluate on Mamba-130M across 6 zero-shot benchmarks. Results show that Quamba-SE consistently outperforms Quamba, achieving up to +2.68% on individual benchmarks and up to +0.83% improvement in the average accuracy of 6 datasets.
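The three-scale idea in the abstract can be illustrated with a minimal simulated ("fake") quantizer. This is a sketch under stated assumptions, not the authors' implementation: the abstract does not say how the region boundaries or scales are chosen, so the thresholds `small`, `clip`, and `abs_max`, the function names, and the rule of picking a scale by magnitude are all hypothetical. The sketch only compares reconstruction error against single-scale hard clipping.

```python
QMAX = 127  # largest magnitude of a symmetric signed 8-bit code


def fake_quant(x, scale):
    """Quantize x to a signed 8-bit code with the given scale, then dequantize."""
    q = max(-QMAX, min(QMAX, round(x / scale)))
    return q * scale


def hard_clip_quant(x, clip):
    """Baseline: one scale; anything beyond +/-clip saturates at the clip value."""
    return fake_quant(x, clip / QMAX)


def soft_edge_quant(x, small, clip, abs_max):
    """Three scales selected by magnitude (all thresholds are hypothetical)."""
    a = abs(x)
    if a <= small:
        # Small values: finer step than the standard scale.
        return fake_quant(x, small / QMAX)
    if a <= clip:
        # Normal values: same step as the hard-clip baseline.
        return fake_quant(x, clip / QMAX)
    # Outliers: coarser step, but the magnitude is kept instead of clipped.
    return fake_quant(x, abs_max / QMAX)


# Illustrative values only; these are not taken from the paper.
activations = [0.01, 1.7, -3.2, 10.0]
small, clip, abs_max = 0.5, 4.0, 12.0
for x in activations:
    hard = hard_clip_quant(x, clip)
    soft = soft_edge_quant(x, small, clip, abs_max)
    print(f"{x:+.3f}  hard-clip error={abs(hard - x):.4f}  soft-edge error={abs(soft - x):.4f}")
```

The sketch leaves out how a real kernel would record which scale applies to each value within the 8-bit budget, which the abstract does not describe. It is meant only to show why a coarse outlier scale loses less information than saturating at the clip threshold.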
Taxonomy
Topics: Advanced Neural Network Applications · Speech Recognition and Synthesis · Adversarial Robustness in Machine Learning
