Deep Multi-camera People Detection
Tatjana Chavdarova, François Fleuret

TL;DR
This paper introduces a deep learning approach for multi-camera people detection that leverages monocular datasets and joint multi-view processing, and outperforms existing methods on the PETS 2009 dataset.
Contribution
It presents a novel end-to-end deep learning architecture for multi-view people detection that utilizes monocular datasets and parallel multi-stream processing.
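The multi-stream idea can be illustrated with a minimal sketch. It is not the authors' implementation: the linear-plus-ReLU extractor stands in for a network pre-trained on monocular pedestrian data, sharing one set of weights across views is an assumption, and all names (`extract_features`, `occupancy_probability`), sizes, and the random weights are hypothetical.

```python
import math
import random


def extract_features(patch, weights):
    """Single-view feature extractor: one linear layer followed by ReLU.

    Stand-in (assumption) for a CNN pre-trained on monocular pedestrian data.
    """
    return [max(0.0, sum(w * x for w, x in zip(row, patch))) for row in weights]


def occupancy_probability(view_patches, shared_weights, classifier_weights, bias=0.0):
    """Score one ground-plane position from its image patches in all views."""
    # Parallel streams: the same (assumed shared) weights process each view independently.
    per_view = [extract_features(patch, shared_weights) for patch in view_patches]
    # Joint processing: concatenate the per-view features into a single vector.
    joint = [f for feats in per_view for f in feats]
    # Linear classifier plus sigmoid on the joint feature vector.
    score = sum(w * f for w, f in zip(classifier_weights, joint)) + bias
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical sizes and random weights, for illustration only.
rng = random.Random(0)
num_views, patch_dim, feat_dim = 3, 8, 4
shared_weights = [[rng.uniform(-1, 1) for _ in range(patch_dim)] for _ in range(feat_dim)]
classifier_weights = [rng.uniform(-1, 1) for _ in range(num_views * feat_dim)]
view_patches = [[rng.random() for _ in range(patch_dim)] for _ in range(num_views)]

prob = occupancy_probability(view_patches, shared_weights, classifier_weights)
print(f"occupancy probability: {prob:.3f}")
```

In this sketch only the final classifier sees information from all views, so the per-view streams can be initialised from single-camera data while the joint layer is what would need multi-camera supervision.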
Findings
Outperforms existing methods on the PETS 2009 dataset
Provides a new three-camera HD dataset for research
Open-sources code and trained models
Abstract
This paper addresses the problem of multi-view people occupancy map estimation. Existing solutions for this problem either operate per view or rely on background-subtraction pre-processing. Both approaches lose detection performance as scenes become more crowded: the former does not exploit joint information, whereas the latter deals with ambiguous input, because the foreground blobs become more and more interconnected as the number of targets increases. Although deep learning algorithms have proven to excel on a remarkable number of computer vision tasks, such a method has not yet been applied to this problem, in large part because of the lack of a large-scale multi-camera dataset. The core of our method is an architecture which makes use of a monocular pedestrian dataset, available at larger scale than the multi-view ones, applies parallel processing to the multiple video…
Taxonomy
Topics: Video Surveillance and Tracking Methods · Indoor and Outdoor Localization Technologies · Advanced Image and Video Retrieval Techniques
