PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Chenhongyi Yang, Zehui Chen, Miguel Espinosa, Linus Ericsson, Zhenyu, Wang, Jiaming Liu, and Elliot J. Crowley

TL;DR
PlainMamba introduces a simple, scalable non-hierarchical state space model for visual recognition, leveraging continuous 2D scanning and direction-aware updates to improve spatial feature learning and efficiency.
Contribution
It adapts the Mamba state space model to images with novel 2D scanning and directional encoding, simplifying architecture and enhancing performance on visual tasks.
Findings
Achieves performance gains over previous non-hierarchical models
Competitive with hierarchical models on various visual recognition tasks
Requires less computation for high-resolution inputs
Abstract
We present PlainMamba: a simple non-hierarchical state space model (SSM) designed for general visual recognition. The recent Mamba model has shown how SSMs can be highly competitive with other architectures on sequential data and initial attempts have been made to apply it to images. In this paper, we further adapt the selective scanning process of Mamba to the visual domain, enhancing its ability to learn features from two-dimensional images by (i) a continuous 2D scanning process that improves spatial continuity by ensuring adjacency of tokens in the scanning sequence, and (ii) direction-aware updating which enables the model to discern the spatial relations of tokens by encoding directional information. Our architecture is designed to be easy to use and easy to scale, formed by stacking identical PlainMamba blocks, resulting in a model with constant width throughout all layers. The…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Image and Video Retrieval Techniques · Vehicle License Plate Recognition
