Robust Pedestrian Detection with Uncertain Modality
Qian Bie, Xiao Wang, Bin Yang, Zhixi Yu, Jun Chen, Xin Xu

TL;DR
This paper introduces AUNet, a novel network for robust pedestrian detection that effectively handles uncertain and incomplete modality inputs by validating and adaptively fusing RGB, NIR, and TIR data.
Contribution
It proposes the AUNet framework with UMVR and MAI modules to improve pedestrian detection under unpredictable modality availability, supported by a new TRNT dataset.
Findings
AUNet outperforms existing methods on the TRNT dataset.
UMVR effectively validates modality availability in uncertain conditions.
MAI enhances modality fusion adaptively based on input reliability.
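The ideas of checking which modalities are present and weighting them by reliability can be illustrated with a toy sketch. This is not AUNet's UMVR or MAI module, whose internals are not described on this page. The function name `fuse_available`, the per-modality feature vectors, and the scalar reliability scores are all hypothetical, chosen only to show reliability-weighted averaging over whichever modalities arrive.

```python
from typing import Dict, List, Optional


def fuse_available(
    features: Dict[str, Optional[List[float]]],
    reliability: Dict[str, float],
) -> List[float]:
    """Toy reliability-weighted fusion (hypothetical, not the paper's method).

    `features` maps a modality name (e.g. "rgb", "nir", "tir") to a feature
    vector, or to None when that modality was not captured.
    `reliability` maps each modality name to a non-negative score.
    """
    # Validation step: keep only the modalities that are actually present.
    available = {m: f for m, f in features.items() if f is not None}
    if not available:
        raise ValueError("no modality available")

    # Renormalize the reliability scores over the available modalities.
    total = sum(reliability[m] for m in available)
    if total <= 0:
        raise ValueError("reliability scores must sum to a positive value")

    # Fusion step: weighted average of the available feature vectors.
    dim = len(next(iter(available.values())))
    fused = [0.0] * dim
    for m, f in available.items():
        if len(f) != dim:
            raise ValueError("feature vectors must have the same length")
        w = reliability[m] / total
        for i, v in enumerate(f):
            fused[i] += w * v
    return fused


# Example: NIR is missing, so only RGB and TIR contribute.
example = fuse_available(
    {"rgb": [1.0, 2.0], "nir": None, "tir": [3.0, 4.0]},
    {"rgb": 1.0, "nir": 1.0, "tir": 3.0},
)
```

In this sketch a missing modality simply drops out and the remaining weights are rescaled to sum to one, so the output keeps the same scale whether one, two, or three modalities are present.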
Abstract
Existing cross-modal pedestrian detection (CMPD) employs complementary information from RGB and thermal-infrared (TIR) modalities to detect pedestrians in 24h-surveillance systems. RGB captures rich pedestrian details under daylight, while TIR excels at night. However, TIR focuses primarily on the person's silhouette, neglecting critical texture details essential for detection. The near-infrared (NIR) modality, by contrast, captures texture under low-light conditions, which effectively alleviates performance issues of RGB and detail loss in TIR, thereby reducing missed detections. To this end, we construct a new Triplet RGB-NIR-TIR (TRNT) dataset, comprising 8,281 pixel-aligned image triplets, establishing a comprehensive foundation for algorithmic research. However, due to the variable nature of real-world scenarios, imaging devices may not always capture all three modalities simultaneously. This…
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Gait Recognition and Analysis
