MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection
Tianxiang Chen, Zi Ye, Zhentao Tan, Tao Gong, Yue Wu, Qi Chu, Bin Liu, Nenghai Yu, Jieping Ye

TL;DR
This paper introduces MiM-ISTD, a nested Mamba-in-Mamba model that efficiently captures both global and local features for infrared small target detection, achieving superior accuracy and speed.
Contribution
The paper proposes a novel nested Mamba-in-Mamba architecture tailored for ISTD, combining global and local feature extraction with reduced computational costs.
Findings
MiM-ISTD is 8 times faster than state-of-the-art methods.
It reduces GPU memory usage by 62.2% on high-resolution images.
Experiments demonstrate superior accuracy and efficiency on benchmark datasets.
Abstract
Recently, infrared small target detection (ISTD) has made significant progress, thanks to the development of basic models. Specifically, models that combine CNNs with transformers can successfully extract both local and global features. However, they also inherit the transformer's main disadvantage, i.e., computational complexity that is quadratic in sequence length. Inspired by Mamba, a recent basic model with linear complexity for long-distance modeling, we explore the potential of this state space model for the ISTD task in terms of effectiveness and efficiency. However, directly applying Mamba yields suboptimal performance because it makes insufficient use of local features, which are imperative for detecting small targets. Instead, we tailor a nested structure, Mamba-in-Mamba (MiM-ISTD), for efficient ISTD. It consists of Outer and Inner Mamba blocks to adeptly capture…
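The complexity contrast drawn in the abstract can be illustrated with a toy sketch. The code below is not the MiM-ISTD model or the Mamba implementation; it only shows why a state space recurrence scales linearly with sequence length while full self-attention scores every token pair. The function names (`ssm_scan`, `attention_pair_count`, `scan_step_count`) and the fixed scalar constants `a`, `b`, `c` are assumptions made for illustration. Mamba itself uses learned, input-dependent parameters and a multi-dimensional hidden state.

```python
def ssm_scan(xs, a=0.9, b=1.0, c=1.0):
    """Run a scalar linear state space recurrence over a sequence.

    h_t = a * h_{t-1} + b * x_t
    y_t = c * h_t

    Each token triggers exactly one state update, so the cost grows
    linearly with len(xs). The constants a, b, c are illustrative only.
    """
    h = 0.0
    ys = []
    for x in xs:
        h = a * h + b * x   # carry information forward through the state
        ys.append(c * h)    # read the output from the current state
    return ys


def attention_pair_count(n):
    """Token pairs scored by full self-attention over n tokens."""
    return n * n


def scan_step_count(n):
    """State updates performed by the recurrence over n tokens."""
    return n


if __name__ == "__main__":
    # Doubling the sequence length doubles the scan cost
    # but quadruples the number of attention pairs.
    for n in (1024, 2048, 4096):
        print(n, scan_step_count(n), attention_pair_count(n))
```

An impulse at the first position decays geometrically through the state, which is how information from early tokens reaches later ones without any pairwise comparison. For a high-resolution image flattened into many tokens, this difference between `n` and `n * n` is the motivation the abstract gives for replacing attention with a state space model.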
Taxonomy
Topics: Infrared Target Detection Methodologies
