Occlusion-aware R-CNN: Detecting Pedestrians in a Crowd
Shifeng Zhang, Longyin Wen, Xiao Bian, Zhen Lei, Stan Z. Li

TL;DR
This paper introduces an occlusion-aware R-CNN (OR-CNN) that improves pedestrian detection in crowded scenes by integrating prior knowledge of human body structure and a new aggregation loss, achieving state-of-the-art results on CityPersons, ETH, and INRIA.
Contribution
The paper presents a novel OR-CNN with a part occlusion-aware pooling unit and aggregation loss for improved pedestrian detection under occlusion.
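To make the pooling idea concrete, here is a minimal sketch of visibility-weighted feature fusion, which is one plausible reading of "integrating prior body structure with visibility prediction". Everything in it is an assumption for illustration: the function names, the use of a sigmoid on per-part logits, and the element-wise weighted sum are not taken from the paper's text as summarized here, and the visibility logits are passed in directly rather than predicted by a network.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def fuse_part_features(part_features, visibility_logits):
    """Hypothetical fusion step: weight each body-part feature vector by a
    visibility score in (0, 1) and sum the results element-wise.

    part_features: list of equal-length feature vectors, one per body part.
    visibility_logits: one raw score per part (assumed input, not learned here).
    """
    if len(part_features) != len(visibility_logits):
        raise ValueError("need exactly one visibility logit per part")
    dim = len(part_features[0])
    fused = [0.0] * dim
    for feat, logit in zip(part_features, visibility_logits):
        v = sigmoid(logit)  # occluded parts get a score near 0
        for i, x in enumerate(feat):
            fused[i] += v * x
    return fused


# Example: the second part is treated as heavily occluded, so it barely contributes.
fused = fuse_part_features([[1.0, 2.0], [10.0, 20.0]], [0.0, -50.0])
```

With this weighting, a part judged occluded contributes almost nothing to the fused feature, so the detector's final decision leans on the visible parts.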
Findings
Achieves state-of-the-art results on CityPersons, ETH, and INRIA datasets.
Effectively handles occlusion through prior structure integration.
Performs on par with state-of-the-art methods on Caltech.
Abstract
Pedestrian detection in crowded scenes is a challenging problem since pedestrians often gather together and occlude each other. In this paper, we propose a new occlusion-aware R-CNN (OR-CNN) to improve the detection accuracy in the crowd. Specifically, we design a new aggregation loss to enforce proposals to be close to, and compactly located around, the corresponding objects. Meanwhile, we use a new part occlusion-aware region of interest (PORoI) pooling unit to replace the RoI pooling layer in order to integrate the prior structure information of the human body with visibility prediction into the network to handle occlusion. Our detector is trained in an end-to-end fashion and achieves state-of-the-art results on three pedestrian detection datasets, i.e., CityPersons, ETH, and INRIA, and performs on par with the state of the art on Caltech.
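The abstract says only that the aggregation loss pushes proposals to be close to their objects and compactly located around them. A minimal sketch of a loss with those two properties follows. It assumes a smooth L1 regression term plus a compactness term that compares each ground-truth box with the mean of the proposals assigned to it; the weighting factor `alpha`, the function names, and the use of raw box coordinates are illustrative assumptions, not the authors' stated formulation.

```python
def smooth_l1(x, beta=1.0):
    """Smooth L1 penalty on a scalar difference."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta


def box_loss(pred, target):
    """Sum of smooth L1 penalties over the four box coordinates."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))


def aggregation_loss(preds, assignments, gts, alpha=1.0):
    """Hypothetical aggregation loss = closeness term + alpha * compactness term.

    preds: predicted boxes as (x1, y1, x2, y2) tuples.
    assignments: for each prediction, the index of its ground-truth box.
    gts: ground-truth boxes in the same format.
    """
    if not preds:
        return 0.0
    # Closeness: every proposal should match its assigned ground truth.
    reg = sum(box_loss(p, gts[a]) for p, a in zip(preds, assignments)) / len(preds)
    # Compactness: the mean of the proposals assigned to one object
    # should coincide with that object's ground-truth box.
    groups = {}
    for p, a in zip(preds, assignments):
        groups.setdefault(a, []).append(p)
    com = 0.0
    for a, members in groups.items():
        mean_box = [sum(c) / len(members) for c in zip(*members)]
        com += box_loss(mean_box, gts[a])
    com /= len(groups)
    return reg + alpha * com


# Two proposals straddle one ground-truth box symmetrically.
loss = aggregation_loss([(0, 0, 10, 10), (2, 2, 12, 12)], [0, 0], [(1, 1, 11, 11)])
```

In the example, the two proposals average exactly onto the ground truth, so the compactness term is zero and only the per-proposal closeness term contributes.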
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
