Generalizable Multi-Camera 3D Pedestrian Detection
João Paulo Lima, Rafael Roberto, Lucas Figueiredo, Francisco Simões, Veronica Teichrieb

TL;DR
This paper introduces a multi-camera 3D pedestrian detection method that requires no training on data from the target scene. It combines pose-based ground-plane localization, clique cover fusion, and optional re-identification, and outperforms state-of-the-art generalizable detection techniques on the WILDTRACK dataset.
Contribution
The paper presents a 3D pedestrian detection approach that needs no scene-specific training. It combines a pose-based localization heuristic, a new clique cover formulation for multi-view fusion, and an optional domain-generalizable re-identification step.
Findings
MODA of 0.569 on WILDTRACK
F-score of 0.78 on WILDTRACK, surpassing state-of-the-art generalizable detection techniques
Works without training data from the target scene
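For readers unfamiliar with the two reported metrics, the sketch below shows how MODA and F-score are commonly computed from detection counts. It assumes the standard definitions (MODA = 1 - (FN + FP) / number of ground-truth objects, and F-score as the harmonic mean of precision and recall). The function names are hypothetical and any counts passed in are illustrative; the paper's own evaluation code is not reproduced here.

```python
def moda(false_negatives: int, false_positives: int, num_ground_truth: int) -> float:
    """Multiple Object Detection Accuracy under the standard definition (assumed here)."""
    if num_ground_truth <= 0:
        raise ValueError("num_ground_truth must be positive")
    return 1.0 - (false_negatives + false_positives) / num_ground_truth


def f_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall; returns 0.0 when undefined."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)
```

Because MODA penalizes misses and false alarms relative to the ground-truth count, it can be much lower than the F-score for the same detector, which is consistent with the gap between the two figures above.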
Abstract
We present a multi-camera 3D pedestrian detection method that does not need to be trained on data from the target scene. We estimate pedestrian location on the ground plane using a novel heuristic based on human body poses and person bounding boxes from an off-the-shelf monocular detector. We then project these locations onto the world ground plane and fuse them with a new formulation of a clique cover problem. We also propose an optional step for exploiting pedestrian appearance during fusion by using a domain-generalizable person re-identification model. We evaluated the proposed approach on the challenging WILDTRACK dataset. It obtained a MODA of 0.569 and an F-score of 0.78, superior to state-of-the-art generalizable detection techniques.
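The projection and fusion steps described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: each camera is represented by a 3x3 image-to-ground homography `H`, each detection is already reduced to a single foot point in pixels, and fusion uses a simple greedy grouping rather than the paper's clique cover formulation. The names `to_ground`, `greedy_clique_fusion`, and `max_dist` are hypothetical.

```python
import math


def to_ground(H, u, v):
    """Map pixel (u, v) to ground-plane (x, y) with a 3x3 homography H (assumed input)."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return (x / w, y / w)


def greedy_clique_fusion(detections, max_dist):
    """Group ground-plane detections into cliques and average each group.

    detections: list of (camera_id, (x, y)) tuples.
    Two detections are compatible when they come from different cameras
    and lie within max_dist of each other. This greedy pass is only a
    stand-in for the paper's clique cover optimization.
    """
    def compatible(a, b):
        return a[0] != b[0] and math.dist(a[1], b[1]) <= max_dist

    unassigned = list(range(len(detections)))
    fused = []
    while unassigned:
        clique = [unassigned.pop(0)]
        for idx in list(unassigned):
            # Add a detection only if it is compatible with every clique member.
            if all(compatible(detections[idx], detections[m]) for m in clique):
                clique.append(idx)
                unassigned.remove(idx)
        xs = [detections[i][1][0] for i in clique]
        ys = [detections[i][1][1] for i in clique]
        fused.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return fused
```

Requiring every member of a group to come from a different camera reflects the idea that one pedestrian is seen at most once per view. A greedy pass depends on input order and can miss the best grouping, which is one reason an explicit clique cover formulation is attractive.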
