TL;DR
This paper introduces a modular RGB-D scene understanding framework for railway platform safety monitoring, together with a new dataset and an evaluation of detection and tracking methods that shows improved accuracy, especially in challenging conditions.
Contribution
It presents a flexible RGB-D analysis scheme and a novel annotated railway platform dataset, and demonstrates enhanced detection and tracking performance using combined spatial and learned features.
Findings
Depth-based spatial information improves detection accuracy.
Combined modalities outperform single-method approaches.
Robustness is improved in occlusion scenarios.
Abstract
Automated monitoring and analysis of passenger movement in safety-critical parts of transport infrastructures represent a relevant visual surveillance task. Recent breakthroughs in visual representation learning and spatial sensing opened up new possibilities for detecting and tracking humans and objects within a 3D spatial context. This paper proposes a flexible analysis scheme and a thorough evaluation of various processing pipelines to detect and track humans on a ground plane, calibrated automatically via stereo depth and pedestrian detection. We consider multiple combinations within a set of RGB- and depth-based detection and tracking modalities. We exploit the modular concepts of Meshroom [2] and demonstrate its use as a generic vision processing pipeline and scalable evaluation framework. Furthermore, we introduce a novel open RGB-D railway platform dataset with annotations to…
