A Survey on Visual Mamba

Hanwei Zhang; Ying Zhu; Dan Wang; Lijun Zhang; Tianxiang Chen; Zi Ye

arXiv:2404.15956 · cs.CV · April 29, 2024 · 1 citation

TL;DR

This survey comprehensively reviews the development, applications, and potential of Mamba state space models in computer vision, highlighting their advantages, adaptations, and diverse use cases across various visual tasks.

Contribution

It provides the first in-depth analysis of Mamba models in computer vision, categorizing foundational and enhanced models, and exploring their applications in multiple vision tasks.

Findings

01

Mamba models effectively handle long-sequence vision tasks.

02

They are used as backbones in diverse vision applications.

03

Enhanced vision Mamba models incorporate techniques such as convolution, recurrence, and attention.

Abstract

State space models (SSMs) with selection mechanisms and hardware-aware architectures, namely Mamba, have recently demonstrated significant promise in long-sequence modeling. Because the self-attention mechanism in transformers has quadratic complexity in image size, and therefore rapidly growing computational demands, researchers are now exploring how to adapt Mamba to computer vision tasks. This paper is the first comprehensive survey aiming to provide an in-depth analysis of Mamba models in the field of computer vision. We begin by exploring the foundational concepts contributing to Mamba's success, including the state space model framework, selection mechanisms, and hardware-aware design. Next, we review vision Mamba models by categorizing them into foundational models and models enhanced with techniques such as convolution, recurrence, and attention to improve their sophistication. We…
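The state space recurrence and selection mechanism named in the abstract can be illustrated with a small sketch. It is not the survey's or Mamba's implementation: it runs a single scalar channel sequentially in plain Python, whereas Mamba uses a hardware-aware parallel scan, and every parameter name here (`a`, `w_delta`, `b_delta`, `w_b`, `w_c`) is a hypothetical stand-in for a learned projection. What it shows is that the step size and the input and output maps depend on the current input, and that the cost grows linearly with sequence length.

```python
import math


def softplus(z):
    # Numerically stable softplus; keeps the step size strictly positive.
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def selective_scan(xs, a, w_delta, b_delta, w_b, w_c):
    """Run a single-channel selective SSM over a scalar sequence xs.

    a:                diagonal entries of the state matrix (assumed negative)
    w_delta, b_delta: hypothetical scalars giving the input-dependent step size
    w_b, w_c:         hypothetical weights giving input-dependent B_t and C_t
    """
    n = len(a)
    h = [0.0] * n          # hidden state, one entry per state dimension
    ys = []
    for x in xs:           # one pass over the sequence: O(len(xs) * n)
        delta = softplus(w_delta * x + b_delta)   # selection: step size from input
        y = 0.0
        for i in range(n):
            a_bar = math.exp(delta * a[i])        # zero-order-hold discretization of A
            b_bar = delta * (w_b[i] * x)          # simplified Euler discretization of B_t
            h[i] = a_bar * h[i] + b_bar * x       # h_t = A_bar * h_{t-1} + B_bar * x_t
            y += (w_c[i] * x) * h[i]              # y_t = C_t . h_t
        ys.append(y)
    return ys


if __name__ == "__main__":
    out = selective_scan(
        [1.0, 0.5, -0.25, 2.0],
        a=[-1.0, -0.5], w_delta=0.5, b_delta=0.0,
        w_b=[1.0, 0.3], w_c=[0.7, -0.2],
    )
    print(out)
```

For an image, `xs` would be a flattened sequence of patch features; how patches are ordered into that sequence is a design choice this sketch leaves open.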

Code & Models

Repositories

ziyangwang007/mamba-unet
pytorch

Taxonomy

Topics: Urban Design and Spatial Analysis