Mamba YOLO: A Simple Baseline for Object Detection with State Space Model

Zeyu Wang; Chen Li; Huiying Xu; Xinzhong Zhu; Hongbo Li

arXiv:2406.05835·cs.CV·December 17, 2024·37 cites

TL;DR

Mamba YOLO introduces a simple, efficient baseline for real-time object detection using a State Space Model to reduce complexity and enhance performance, achieving state-of-the-art results on COCO with fast inference.

Contribution

The paper presents ODMamba, a novel backbone with linear complexity SSM, and a multi-branch RG Block, enabling effective, real-time object detection without pretraining.

Findings

1. Achieves 7.5% mAP improvement on COCO benchmark.
2. Runs inference at 1.5 ms on a single 4090 GPU.
3. Outperforms previous methods in real-time object detection.

Abstract

Driven by the rapid development of deep learning technology, the YOLO series has set a new benchmark for real-time object detectors. Additionally, transformer-based structures have emerged as the most powerful solution in the field, greatly extending the model's receptive field and achieving significant performance improvements. However, this improvement comes at a cost, as the quadratic complexity of the self-attention mechanism increases the computational burden of the model. To address this problem, we introduce a simple yet effective baseline approach called Mamba YOLO. Our contributions are as follows: 1) We propose the ODMamba backbone, which introduces a State Space Model (SSM) with linear complexity to address the quadratic complexity of self-attention. Unlike other Transformer-based and SSM-based methods, ODMamba is simple to train without…
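The complexity contrast in the abstract can be illustrated with a minimal sketch. This is a generic, time-invariant, diagonal state space recurrence over a 1-D sequence, not the ODMamba backbone or Mamba's selective scan; the function names (`ssm_scan`, `attention_pair_count`, `scan_step_count`) and the parameters `a`, `b`, `c` are hypothetical and chosen only for illustration.

```python
from typing import List


def ssm_scan(x: List[float], a: List[float], b: List[float], c: List[float]) -> List[float]:
    """Run a diagonal linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t   (elementwise over the state dimension)
    y_t = sum(c * h_t)

    Each input token is visited once, so the cost is O(L * N) for
    sequence length L and state size N.
    """
    n = len(a)
    h = [0.0] * n  # hidden state, initialised to zero
    y = []
    for x_t in x:
        # Update every state channel from the previous state and the new input.
        h = [a[i] * h[i] + b[i] * x_t for i in range(n)]
        # Project the state back to a scalar output.
        y.append(sum(c[i] * h[i] for i in range(n)))
    return y


def attention_pair_count(length: int) -> int:
    """Number of token pairs scored by full self-attention: L * L."""
    return length * length


def scan_step_count(length: int, state_size: int) -> int:
    """Number of state updates in the recurrence above: L * N."""
    return length * state_size


if __name__ == "__main__":
    # An impulse input decays geometrically through a single state channel.
    print(ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))
    # Doubling the sequence length quadruples attention pairs
    # but only doubles the scan steps.
    for length in (1024, 2048):
        print(length, attention_pair_count(length), scan_step_count(length, 16))
```

Under these assumptions, doubling the sequence length doubles the work of the recurrence while quadrupling the number of pairwise attention scores, which is the gap the abstract refers to as linear versus quadratic complexity.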

Taxonomy

Topics: Computer Science and Engineering

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces · Sparse Evolutionary Training