PerfMamba: Performance Analysis and Pruning of Selective State Space Models
Abdullah Al Asif, Mobina Kashaniyan, Sixing Yu, Juan Pablo Muñoz, and Ali Jannesari

TL;DR
This paper provides an empirical performance analysis of selective state space models, introduces a pruning method to improve efficiency, and demonstrates significant speed and memory improvements without major accuracy loss.
Contribution
The paper profiles Mamba models in detail, identifies the SSM component as a computational bottleneck, and proposes a pruning technique to improve efficiency.
Findings
The SSM component consumes most of the computational resources
Pruning low-activity states improves throughput and reduces memory use
The pruning method achieves a 1.14x speedup and an 11.50% memory reduction
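To make the idea of pruning low-activity states concrete, here is a minimal sketch in plain Python. The summary does not say how the paper measures state activity or how it selects states, so everything here is an assumption made for illustration: activity is taken to be the mean absolute value of each state dimension over a recorded trace, a fixed fraction of dimensions is kept, and the names `state_activity`, `select_active_dims`, `prune_states`, and `keep_ratio` are hypothetical.

```python
from typing import List, Sequence


def state_activity(states: Sequence[Sequence[float]]) -> List[float]:
    """Mean absolute value of each state dimension over all time steps.

    `states` is a list of time steps, each holding N state values.
    This activity measure is an assumption, not the paper's definition.
    """
    n_steps = len(states)
    n_dims = len(states[0])
    return [sum(abs(step[d]) for step in states) / n_steps for d in range(n_dims)]


def select_active_dims(activity: Sequence[float], keep_ratio: float) -> List[int]:
    """Indices of the most active state dimensions, returned in original order."""
    n_keep = max(1, int(len(activity) * keep_ratio))
    ranked = sorted(range(len(activity)), key=lambda d: activity[d], reverse=True)
    return sorted(ranked[:n_keep])


def prune_states(states: Sequence[Sequence[float]], kept: Sequence[int]) -> List[List[float]]:
    """Drop every state dimension whose index is not in `kept`."""
    return [[step[d] for d in kept] for step in states]


# Toy trace: 3 time steps, 4 state dimensions; dimensions 1 and 3 barely move.
trace = [
    [0.9, 0.01, -0.5, 0.00],
    [-1.1, 0.02, 0.4, 0.01],
    [1.0, -0.01, -0.6, 0.00],
]
activity = state_activity(trace)
kept = select_active_dims(activity, keep_ratio=0.5)  # keeps dimensions 0 and 2
pruned = prune_states(trace, kept)
```

In a real model the same selection would be applied to the parameters that produce and consume those state dimensions, so that the smaller state also shrinks the scan itself. This sketch only shows the selection step on a recorded trace.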
Abstract
Recent advances in sequence modeling have introduced selective SSMs as promising alternatives to Transformer architectures, offering theoretical computational efficiency and sequence processing advantages. A comprehensive understanding of selective SSMs in runtime behavior, resource utilization patterns, and scaling characteristics still remains unexplored, thus obstructing their optimal deployment and further architectural improvements. This paper presents a thorough empirical study of Mamba-1 and Mamba-2, systematically profiled for performance to assess the design principles that contribute to their efficiency in state-space modeling. A detailed analysis of computation patterns, memory access, I/O characteristics, and scaling properties was performed for sequence lengths ranging from 64 to 16384 tokens. Our findings show that the SSM component, a central part of the selective SSM…
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Embedded Systems Design Techniques · Advanced Data Storage Technologies
