BAANet: Learning Bi-directional Adaptive Attention Gates for Multispectral Pedestrian Detection
Xiaoxiao Yang, Yeqian Qiang, Huijie Zhu, Chunxiang Wang, Ming Yang

TL;DR
This paper introduces BAANet, a novel bi-directional attention-based fusion module for multispectral pedestrian detection, improving feature integration from RGB and TIR images under varying illumination conditions.
Contribution
The work proposes the BAA-Gate, an adaptive attention mechanism with bi-directional multi-stage fusion and illumination-based weighting for enhanced multispectral feature fusion.
Findings
Outperforms existing methods on the KAIST dataset
Achieves superior detection accuracy with real-time speed
Demonstrates robustness to illumination changes
Abstract
Thermal infrared (TIR) images have proven effective in providing temperature cues to the RGB features for multispectral pedestrian detection. Most existing methods directly inject the TIR modality into the RGB-based framework or simply ensemble the results of the two modalities. This, however, could lead to inferior detection performance, as the RGB and TIR features generally have modality-specific noise, which might worsen the features as they propagate through the network. Therefore, this work proposes an effective and efficient cross-modality fusion module called Bi-directional Adaptive Attention Gate (BAA-Gate). Based on the attention mechanism, the BAA-Gate is devised to distill the informative features and recalibrate the representations asymptotically. Concretely, a bi-directional multi-stage fusion strategy is adopted to progressively optimize features of two modalities and…
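To make the idea of a bi-directional, illumination-weighted attention gate more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract above is truncated and does not specify the gate's internals, so the channel gate (sigmoid of a global average), the residual update, the `illum` score in [0, 1], and the names `channel_gate`, `baa_gate_sketch`, and `multi_stage_sketch` are all assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(feat):
    """Assumed attention: one weight per channel, sigmoid of its global average."""
    return [sigmoid(sum(ch) / len(ch)) for ch in feat]


def baa_gate_sketch(rgb, tir, illum):
    """Hypothetical single-stage bi-directional gate (illustrative only).

    rgb, tir: lists of channels, each channel a flat list of floats,
              with identical shapes for both modalities.
    illum:    assumed illumination score in [0, 1]; 1.0 means bright scene,
              so RGB is trusted more and TIR less.
    """
    w_rgb, w_tir = illum, 1.0 - illum
    g_rgb = channel_gate(rgb)
    g_tir = channel_gate(tir)
    # TIR -> RGB: add attention-gated TIR features, scaled by TIR reliability.
    rgb_out = [
        [r + w_tir * g * t for r, t in zip(rc, tc)]
        for rc, tc, g in zip(rgb, tir, g_tir)
    ]
    # RGB -> TIR: the mirrored direction, scaled by RGB reliability.
    tir_out = [
        [t + w_rgb * g * r for r, t in zip(rc, tc)]
        for rc, tc, g in zip(rgb, tir, g_rgb)
    ]
    return rgb_out, tir_out


def multi_stage_sketch(rgb, tir, illum, stages=3):
    """Apply the gate repeatedly, standing in for multi-stage fusion."""
    for _ in range(stages):
        rgb, tir = baa_gate_sketch(rgb, tir, illum)
    return rgb, tir
```

In this sketch, an `illum` of 1.0 leaves the RGB features untouched and only refines the TIR branch, while an `illum` of 0.0 does the reverse. A real detector would apply such a gate to convolutional feature maps at several backbone stages and learn the gate parameters, which this toy version does not attempt.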
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Infrared Target Detection Methodologies
