DMM: Disparity-guided Multispectral Mamba for Oriented Object Detection in Remote Sensing
Minghang Zhou, Tianyu Li, Chaofan Qiao, Dongyu Xie, Guoqing Wang, Ningjuan Ruan, Lin Mei, Yang Yang

TL;DR
This paper introduces DMM, an efficient multispectral object detection framework that uses disparity-guided fusion and attention modules to improve accuracy while reducing computational complexity.
Contribution
The paper proposes DMM, a novel multispectral detection method combining disparity-guided fusion, multi-scale attention, and target-aware auxiliary tasks for better performance.
Findings
Outperforms state-of-the-art methods on DroneVehicle and VEDAI datasets.
Achieves high detection accuracy with lower computational cost.
Effectively mitigates inter-modal and intra-modal discrepancies.
Abstract
Multispectral oriented object detection faces challenges due to both inter-modal and intra-modal discrepancies. Recent studies often rely on transformer-based models to address these issues and achieve cross-modal fusion detection. However, the quadratic computational complexity of transformers limits their performance. Inspired by the efficiency and lower complexity of Mamba in long sequence tasks, we propose Disparity-guided Multispectral Mamba (DMM), a multispectral oriented object detection framework comprised of a Disparity-guided Cross-modal Fusion Mamba (DCFM) module, a Multi-scale Target-aware Attention (MTA) module, and a Target-Prior Aware (TPA) auxiliary task. The DCFM module leverages disparity information between modalities to adaptively merge features from RGB and IR images, mitigating inter-modal conflicts. The MTA module aims to enhance feature representation by focusing…
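The complexity argument in the abstract can be made concrete with a little arithmetic. The sketch below is an illustration only, not the paper's implementation: it compares how the number of pairwise token interactions in full self-attention grows against the number of steps in a linear-time sequential scan (the scaling behavior Mamba-style state space models are known for). The feature-map sizes, the choice to concatenate tokens from both modalities, and all function names are assumptions made for this example. Constant factors and the channel dimension are ignored.

```python
# Illustrative scaling comparison; all names and sizes here are hypothetical.

def tokens_for_feature_map(height: int, width: int, modalities: int = 2) -> int:
    """Token count if each spatial location of each modality becomes one token.

    Assumes RGB and IR tokens are concatenated into a single sequence.
    """
    return height * width * modalities


def attention_pair_count(num_tokens: int) -> int:
    """Token-to-token interactions in full self-attention: N * N."""
    return num_tokens * num_tokens


def linear_scan_steps(num_tokens: int) -> int:
    """Sequential state updates in a linear-time scan: N."""
    return num_tokens


if __name__ == "__main__":
    # Hypothetical feature-map side lengths, chosen only to show the trend.
    for side in (20, 40, 80):
        n = tokens_for_feature_map(side, side)
        print(
            f"side={side:3d} tokens={n:6d} "
            f"attention_pairs={attention_pair_count(n):12d} "
            f"scan_steps={linear_scan_steps(n):6d}"
        )
```

Doubling the side length of the feature map quadruples the token count, which multiplies the attention pair count by sixteen but the scan step count by only four. This is the gap that motivates replacing transformer-based fusion with a linear-complexity alternative on high-resolution remote sensing imagery.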
Taxonomy
Topics: Remote-Sensing Image Classification
Methods: Softmax · Attention Is All You Need · Attentive Walk-Aggregating Graph Neural Network
