Multispectral Detection Transformer with Infrared-Centric Feature Fusion
Seongmin Hwang, Daeyoung Han, Moongu Jeon

TL;DR
This paper introduces IC-Fusion, an infrared-centric multispectral object detection method that prioritizes IR features while fusing in complementary RGB context, and it outperforms existing methods on the FLIR and LLVIP benchmarks.
Contribution
The paper proposes a lightweight IR-centric fusion approach built on a novel Multi-Scale Feature Distillation (MSFD) block and cross-modal interaction modules, advancing multispectral detection performance.
Findings
Outperforms existing methods on FLIR and LLVIP benchmarks.
Effectively emphasizes IR high-frequency information for detection.
Demonstrates efficiency with a lightweight design.
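To make the "high-frequency information" finding concrete, here is a minimal sketch of how one might measure the share of an image's energy that lies in its high-frequency wavelet sub-bands. It assumes a single-level orthonormal Haar transform written in plain NumPy. The choice of Haar, the function names `haar_dwt2` and `high_freq_energy_ratio`, and the energy-ratio metric are illustrative assumptions, not the paper's analysis procedure.

```python
import numpy as np


def haar_dwt2(img):
    """Single-level orthonormal 2D Haar transform (illustrative helper).

    Returns four sub-bands: the low-pass approximation and three detail
    bands (row differences, column differences, diagonal differences).
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # crop to even height and width

    # The four pixels of every non-overlapping 2x2 block.
    a = img[0::2, 0::2]  # top-left
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 2.0      # low-pass in both directions
    row_diff = (a + b - c - d) / 2.0    # top row minus bottom row
    col_diff = (a - b + c - d) / 2.0    # left column minus right column
    diag_diff = (a - b - c + d) / 2.0   # diagonal detail
    return approx, row_diff, col_diff, diag_diff


def high_freq_energy_ratio(img):
    """Fraction of total energy carried by the three detail sub-bands."""
    approx, row_diff, col_diff, diag_diff = haar_dwt2(img)
    high = sum(float(np.sum(band ** 2)) for band in (row_diff, col_diff, diag_diff))
    total = high + float(np.sum(approx ** 2))
    return high / total if total > 0 else 0.0
```

Under these assumptions, comparing `high_freq_energy_ratio` on aligned grayscale RGB and IR frames would give one rough indication of which modality carries more edge-like detail. A smooth image scores near 0, and a fine checkerboard scores 0.5.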
Abstract
Multispectral object detection aims to leverage complementary information from visible (RGB) and infrared (IR) modalities to enable robust performance under diverse environmental conditions. Our key insight, derived from wavelet analysis and empirical observations, is that IR images contain structurally rich high-frequency information critical for object detection, making an infrared-centric approach highly effective. To capitalize on this finding, we propose Infrared-Centric Fusion (IC-Fusion), a lightweight and modality-aware sensor fusion method that prioritizes infrared features while effectively integrating complementary RGB semantic context. IC-Fusion adopts a compact RGB backbone and designs a novel fusion module comprising a Multi-Scale Feature Distillation (MSFD) block to enhance RGB features and a three-stage fusion block with a Cross-Modal Channel Shuffle Gate (CCSG), a…
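The abstract is cut off before it describes the Cross-Modal Channel Shuffle Gate, so its design is not reproduced here. As background only, the generic channel shuffle operation that the name refers to can be sketched in NumPy as follows. The function name `channel_shuffle`, the `(N, C, H, W)` layout, and the idea of applying it to concatenated RGB and IR feature maps are assumptions made for illustration.

```python
import numpy as np


def channel_shuffle(x, groups):
    """Interleave channels across groups (generic channel shuffle).

    x is assumed to have shape (N, C, H, W) with C divisible by `groups`.
    """
    n, c, h, w = x.shape
    if c % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    # Split channels into (groups, channels_per_group), swap the two axes,
    # then flatten back so that channels from different groups alternate.
    x = x.reshape(n, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(n, c, h, w)
```

For example, if three hypothetical RGB feature channels `[r0, r1, r2]` are concatenated with three IR channels `[i0, i1, i2]` and shuffled with `groups=2`, the channel order becomes `[r0, i0, r1, i1, r2, i2]`, so a subsequent grouped or local operation sees both modalities side by side.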
Taxonomy
Topics: Infrared Target Detection Methodologies
Methods: ADaptive gradient method with the OPTimal convergence rate · Channel Shuffle
