Mamba-R: Vision Mamba ALSO Needs Registers

Feng Wang; Jiahao Wang; Sucheng Ren; Guoyizhe Wei; Jieru Mei; Wei Shao; Yuyin Zhou; Alan Yuille; Cihang Xie

arXiv:2405.14858 · cs.CV · June 2, 2025 · 3 cites

TL;DR

Mamba-R introduces register tokens into Vision Mamba to reduce feature-map artifacts, sharpen focus on semantically meaningful regions, and improve performance, especially at larger model scales. The gains are shown on ImageNet and on segmentation tasks.

Contribution

This paper proposes Mamba-R, a novel architecture that incorporates register tokens into Vision Mamba to mitigate artifacts and improve scalability and accuracy.

Findings

01. Mamba-R achieves 83.0% accuracy on ImageNet with a base model.

02. Scaling Mamba-R to 341M parameters yields 83.6% accuracy.

03. Qualitative results show cleaner, more focused feature maps.

Abstract

Similar to Vision Transformers, this paper identifies artifacts also present within the feature maps of Vision Mamba. These artifacts, corresponding to high-norm tokens emerging in low-information background areas of images, appear much more severe in Vision Mamba -- they exist prevalently even with the tiny-sized model and activate extensively across background regions. To mitigate this issue, we follow the prior solution of introducing register tokens into Vision Mamba. To better cope with Mamba blocks' uni-directional inference paradigm, two key modifications are introduced: 1) evenly inserting registers throughout the input token sequence, and 2) recycling registers for final decision predictions. We term this new architecture Mamba-R. Qualitative observations suggest, compared to vanilla Vision Mamba, Mamba-R's feature maps appear cleaner and more focused on semantically meaningful…
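The two modifications named in the abstract (evenly inserting registers, then recycling them for the final prediction) can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names `interleave_registers` and `recycle_registers` are hypothetical, tokens are plain lists of floats, placing one register after each equal-sized chunk of patch tokens is an assumed placement rule, and concatenation is an assumed way of combining register outputs. The Mamba blocks that would run between the two steps are omitted.

```python
from typing import List, Sequence, Tuple

Token = List[float]


def interleave_registers(
    patches: Sequence[Token], registers: Sequence[Token]
) -> Tuple[List[Token], List[int]]:
    """Spread register tokens evenly through the patch-token sequence.

    Assumption for this sketch: the patches are split into len(registers)
    contiguous chunks whose sizes differ by at most one, and one register
    is placed after each chunk. Returns the combined sequence and the
    index of every register inside it.
    """
    n, r = len(patches), len(registers)
    if r == 0:
        return list(patches), []
    sequence: List[Token] = []
    register_positions: List[int] = []
    for i, reg in enumerate(registers):
        start, end = i * n // r, (i + 1) * n // r  # bounds of chunk i
        sequence.extend(patches[start:end])
        register_positions.append(len(sequence))  # index the register will take
        sequence.append(reg)
    return sequence, register_positions


def recycle_registers(
    outputs: Sequence[Token], register_positions: Sequence[int]
) -> Token:
    """Build the feature for the prediction head from register outputs.

    Assumption for this sketch: the outputs at the register positions are
    simply concatenated; any projection applied before the head is omitted.
    """
    feature: Token = []
    for pos in register_positions:
        feature.extend(outputs[pos])
    return feature
```

With 8 toy patch tokens and 2 registers, `interleave_registers` places the registers at indices 4 and 9 of the 10-token sequence. In a real model the sequence would pass through the Mamba blocks before `recycle_registers` gathers the outputs at those same indices.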

Code & Models

Repositories

wangf3014/mamba-reg
pytorch

Taxonomy

Topics: Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques