GFD-SSD: Gated Fusion Double SSD for Multispectral Pedestrian Detection
Yang Zheng, Izzat H. Izzat, Shahrzad Ziaee

TL;DR
This paper introduces GFD-SSD, a novel multispectral pedestrian detection method that fuses color and thermal images using gated fusion units within SSD frameworks, achieving high accuracy and speed for autonomous driving.
Contribution
It proposes a new gated fusion strategy for combining features from dual SSDs on color and thermal images, improving detection performance and inference speed.
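To make the contrast with stacked fusion concrete, here is a minimal pure-Python sketch. The summary does not spell out the internals of the gated fusion unit, so the gate design below is an assumption: per-modality gates computed from the concatenated features, added back to each input, then mixed by a 1x1 convolution. The helper names (`pointwise_conv`, `stacked_fusion`, `gated_fusion`) and the parameter keys are hypothetical, and 1x1 convolutions are used throughout to keep the example short.

```python
# Feature maps are nested lists shaped [channels][height][width].
# Illustrative only: the gate design is an assumption, not the paper's exact unit.

def relu_map(features):
    """Apply ReLU to every value of a feature map."""
    return [[[v if v > 0.0 else 0.0 for v in row] for row in plane]
            for plane in features]

def add_maps(a, b):
    """Element-wise sum of two feature maps with identical shapes."""
    return [[[x + y for x, y in zip(row_a, row_b)]
             for row_a, row_b in zip(plane_a, plane_b)]
            for plane_a, plane_b in zip(a, b)]

def pointwise_conv(features, weights, bias):
    """1x1 convolution; weights is shaped [out_channels][in_channels]."""
    height, width = len(features[0]), len(features[0][0])
    return [
        [[bias[o] + sum(w * plane[y][x] for w, plane in zip(row, features))
          for x in range(width)] for y in range(height)]
        for o, row in enumerate(weights)
    ]

def stacked_fusion(color, thermal):
    """Baseline: concatenate the two feature maps along the channel axis."""
    return color + thermal

def gated_fusion(color, thermal, params):
    """Assumed gated unit: gates from the stacked features, then a 1x1 mix."""
    stacked = stacked_fusion(color, thermal)
    # One gate per modality, both computed from the combined features.
    gate_c = relu_map(pointwise_conv(stacked, params["w_c"], params["b_c"]))
    gate_t = relu_map(pointwise_conv(stacked, params["w_t"], params["b_t"]))
    # Add each gate to its own modality, then concatenate the results.
    enhanced = add_maps(color, gate_c) + add_maps(thermal, gate_t)
    # The final 1x1 convolution mixes the modalities and sets the channel count.
    return relu_map(pointwise_conv(enhanced, params["w_j"], params["b_j"]))
```

In this sketch, stacked fusion doubles the channel count, while the gated unit returns as many channels as `w_j` has rows, so it can match the channel count of a single SSD feature map.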
Findings
Outperforms stacked fusion methods in pedestrian detection accuracy.
Achieves the lowest miss rate on the KAIST dataset.
Runs twice as fast as Faster-RCNN-based fusion networks.
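For readers unfamiliar with the metric, the sketch below shows how a miss rate is computed and how pedestrian benchmarks commonly summarize it as a log-average over false positives per image (FPPI). The summary does not say which variant the paper reports, so the nine reference points between 10^-2 and 10^0 and the function names are assumptions drawn from the usual Caltech-style protocol.

```python
import math

def miss_rate(true_positives, false_negatives):
    """Fraction of ground-truth pedestrians the detector failed to find."""
    total = true_positives + false_negatives
    return false_negatives / total if total else 0.0

def log_average_miss_rate(curve, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI values.

    curve: list of (fppi, miss_rate) pairs sorted by increasing fppi.
    Assumes num_points reference values evenly log-spaced in [1e-2, 1e0].
    """
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    samples = []
    for ref in refs:
        # Use the operating point with the largest FPPI not exceeding ref;
        # if the curve never gets that low, count it as missing everything.
        eligible = [mr for fppi, mr in curve if fppi <= ref]
        samples.append(eligible[-1] if eligible else 1.0)
    # Clamp to avoid log(0) when a sampled miss rate is exactly zero.
    return math.exp(sum(math.log(max(s, 1e-10)) for s in samples) / len(samples))
```

Lower is better for both quantities: a detector that finds 80 of 100 pedestrians has a miss rate of 0.2 at that operating point.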
Abstract
Pedestrian detection is an essential task in autonomous driving research. In addition to typical color images, thermal images benefit the detection in dark environments. Hence, it is worthwhile to explore an integrated approach to take advantage of both color and thermal images simultaneously. In this paper, we propose a novel approach to fuse color and thermal sensors using deep neural networks (DNN). Current state-of-the-art DNN object detectors vary from two-stage to one-stage mechanisms. Two-stage detectors, like Faster-RCNN, achieve higher accuracy, while one-stage detectors such as Single Shot Detector (SSD) demonstrate faster performance. To balance the trade-off, especially in the consideration of autonomous driving applications, we investigate a fusion strategy to combine two SSDs on color and thermal inputs. Traditional fusion methods stack selected features from each channel…
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Industrial Vision Systems and Defect Detection
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Convolution · Non Maximum Suppression · 1x1 Convolution · SSD
