Beyond Mamba: Enhancing State-space Models with Deformable Dilated Convolutions for Multi-scale Traffic Object Detection

Jun Li; Yingying Shi; Zhixuan Ruan; Nan Guo; Jianhua Xu

arXiv:2604.08038 · cs.CV · April 10, 2026

TL;DR

This paper introduces MDDCNet, a novel traffic object detection model that combines deformable dilated convolutions with Mamba blocks for improved multi-scale detection in complex scenes.

Contribution

It proposes a hybrid backbone with multi-scale deformable dilated convolutions and Mamba blocks, along with a channel-enhanced feed-forward network and an attention-based feature pyramid network.
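
As a rough illustration of the multi-scale dilated convolution idea named above, the following sketch applies one 1D kernel at several dilation rates. It is not the paper's MSDDC block: the learned sampling offsets of deformable convolution are omitted, and the function names, the 1D setting, and the dilation rates (1, 2, 3) are all assumptions made for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: dilation * (kernel_size - 1) + 1."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D dilated cross-correlation (no padding, stride 1)."""
    span = effective_kernel_size(len(kernel), dilation)
    outputs = []
    for start in range(len(signal) - span + 1):
        # Kernel taps are spaced `dilation` samples apart.
        outputs.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return outputs


def multi_scale_branches(signal, kernel, dilations=(1, 2, 3)):
    """Hypothetical multi-scale stage: the same kernel run at each dilation rate."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5, 6, 7]
    for d, out in multi_scale_branches(signal, [1, 1, 1]).items():
        print(d, effective_kernel_size(3, d), out)
```

A larger dilation widens the span each output sees without adding weights, which is why stacking several rates is a common way to cover objects of different sizes.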

Findings

1. Outperforms existing detectors on benchmark datasets
2. Effectively captures small objects with local details
3. Enhances multi-scale feature fusion and interaction

Abstract

In a real-world traffic scenario, varying-scale objects are usually distributed in a cluttered background, which poses great challenges to accurate detection. Although current Mamba-based methods can efficiently model long-range dependencies, they still struggle to capture small objects with abundant local details, which hinders joint modeling of local structures and global semantics. Moreover, state-space models exhibit limited hierarchical feature representation and weak cross-scale interaction due to flat sequential modeling and insufficient spatial inductive biases, leading to sub-optimal performance in complex scenes. To address these issues, we propose a Mamba with Deformable Dilated Convolutions Network (MDDCNet) for accurate traffic object detection in this study. In MDDCNet, a well-designed hybrid backbone with successive Multi-Scale Deformable Dilated Convolution (MSDDC)…
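
To make the "flat sequential modeling" point concrete, here is a minimal sketch, assuming a row-major flattening of a 2D feature map and a scalar, time-invariant linear recurrence. Mamba itself uses input-dependent (selective) parameters, and the scan order used in the paper is not stated in this excerpt, so every name and parameter below is hypothetical.

```python
def flatten_row_major(grid):
    """Turn a 2D grid (list of rows) into a 1D sequence, row by row."""
    return [value for row in grid for value in row]


def sequence_gap(pos_a, pos_b, width):
    """Distance between two (row, col) positions after row-major flattening."""
    (row_a, col_a), (row_b, col_b) = pos_a, pos_b
    return abs((row_a * width + col_a) - (row_b * width + col_b))


def linear_scan(xs, a=0.5, b=1.0, c=1.0):
    """Toy state-space recurrence: h_t = a * h_{t-1} + b * x_t, y_t = c * h_t."""
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x
        ys.append(c * h)
    return ys


if __name__ == "__main__":
    width = 8
    # Horizontal neighbours stay adjacent; vertical neighbours end up `width` steps apart.
    print(sequence_gap((0, 0), (0, 1), width), sequence_gap((0, 0), (1, 0), width))
    print(linear_scan(flatten_row_major([[1.0, 0.0], [0.0, 0.0]])))
```

In this toy setting, two pixels that are vertical neighbours in the image sit `width` steps apart in the sequence, so the recurrence has to carry information across a whole row to relate them. That is one way to read the weak spatial inductive bias the abstract describes.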

Code & Models

Repositories

Bettermea/MDDCNet (GitHub)
