3D Random Occlusion and Multi-Layer Projection for Deep Multi-Camera Pedestrian Localization
Rui Qiu, Ming Xu, Yuyao Yan, Jeremy S. Smith, Xi Yang

TL;DR
This paper introduces a 3D random occlusion data augmentation and a multi-layer feature projection technique for deep multi-camera pedestrian localization, targeting overfitting on small multi-view datasets and robustness under heavy occlusion.
Contribution
It proposes a 3D occlusion simulation and multi-plane feature projection method to enhance multi-view pedestrian localization.
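To make the occlusion simulation concrete, here is a minimal sketch of how a random ground-plane cylinder could be sampled and projected into one camera view. It is an illustration only, not the authors' implementation: the function names, the cylinder radius and height (0.25 m and 1.7 m), the 3x4 projection matrix `P`, and the use of an axis-aligned bounding box as the occluded region are all assumptions made for this example.

```python
import math
import random

def sample_cylinder(x_range, y_range, radius=0.25, height=1.7, rng=random):
    """Draw a cylinder at a uniformly random ground position.
    radius/height are assumed stand-ins for an average pedestrian size."""
    return (rng.uniform(*x_range), rng.uniform(*y_range), radius, height)

def cylinder_points(cx, cy, radius, height, n=16):
    """Sample n points on the bottom rim (Z=0) and n on the top rim (Z=height)."""
    pts = []
    for k in range(n):
        a = 2.0 * math.pi * k / n
        x, y = cx + radius * math.cos(a), cy + radius * math.sin(a)
        pts.append((x, y, 0.0))
        pts.append((x, y, height))
    return pts

def project(P, X, Y, Z):
    """Project a world point with a 3x4 camera matrix P (list of 3 rows)."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P)
    return u / w, v / w

def occlusion_box(P, cyl):
    """Image-space bounding box (u_min, v_min, u_max, v_max) of the cylinder.
    A crude approximation of the occluded region, used here for brevity."""
    us, vs = zip(*(project(P, *p) for p in cylinder_points(*cyl)))
    return min(us), min(vs), max(us), max(vs)

# Hypothetical camera matrix used only to exercise the functions.
P_example = [[100.0, 0.0, 50.0, 250.0],
             [0.0, 100.0, 50.0, 250.0],
             [0.0, 0.0, 1.0, 5.0]]
box = occlusion_box(P_example, (0.0, 0.0, 0.25, 1.7))
```

For a multi-camera setup, the same sampled cylinder would be passed to `occlusion_box` once per view with that view's projection matrix, so the simulated occluder stays geometrically consistent across cameras.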
Findings
Improved detection accuracy over state-of-the-art methods
Effective reduction of overfitting in training
Enhanced utilization of multi-view features
Abstract
Although deep-learning based methods for monocular pedestrian detection have made great progress, they are still vulnerable to heavy occlusions. Using multi-view information fusion is a potential solution but has limited applications, due to the lack of annotated training samples in existing multi-view datasets, which increases the risk of overfitting. To address this problem, a data augmentation method is proposed to randomly generate 3D cylinder occlusions, on the ground plane, which are of the average size of pedestrians and projected to multiple views, to relieve the impact of overfitting in the training. Moreover, the feature map of each view is projected to multiple parallel planes at different heights, by using homographies, which allows the CNNs to fully utilize the features across the height of each pedestrian to infer the locations of pedestrians on the ground plane. The…
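The multi-plane projection described in the abstract relies on one homography per height. As a sketch under stated assumptions, given a 3x4 camera matrix with columns p1..p4, the plane Z = h maps to the image through the 3x3 matrix [p1, p2, h·p3 + p4]; this is standard projective geometry rather than a detail taken from the paper. The function names, the example matrix, and the chosen heights below are hypothetical.

```python
def plane_homography(P, height):
    """3x3 homography taking (X, Y, 1) on the plane Z=height to image
    homogeneous coordinates, built from a 3x4 camera matrix P."""
    return [[r[0], r[1], r[2] * height + r[3]] for r in P]

def apply_homography(H, x, y):
    """Apply a 3x3 homography to a 2D point and dehomogenize."""
    u, v, w = (r[0] * x + r[1] * y + r[2] for r in H)
    return u / w, v / w

def project(P, X, Y, Z):
    """Reference projection of a 3D world point with the full camera matrix."""
    u, v, w = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P)
    return u / w, v / w

# Hypothetical camera matrix and plane heights (metres), for illustration only.
P_example = [[100.0, 0.0, 50.0, 250.0],
             [0.0, 100.0, 50.0, 250.0],
             [0.0, 0.0, 1.0, 5.0]]
heights = [0.0, 0.6, 1.2, 1.8]
homographies = [plane_homography(P_example, h) for h in heights]

# Where the ground location (1, 2) lands in the image on each plane.
pixels = [apply_homography(H, 1.0, 2.0) for H in homographies]
```

Warping a view's feature map onto a given plane would use the inverse of the corresponding homography (image to plane); the sketch only shows the forward direction, which is enough to check that each homography agrees with the full 3D projection at its height.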
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Remote Sensing and LiDAR Applications
