View Invariant Human Body Detection and Pose Estimation from Multiple Depth Sensors
Walid Bekhtaoui, Ruhan Sa, Brian Teixeira, Vivek Singh, Klaus Kirchberg, Yao-jen Chang, Ankur Kapoor

TL;DR
This paper introduces Point R-CNN, an end-to-end multi-person 3D pose estimation network that fuses multiple point cloud sources for indoor monitoring and outperforms existing methods, especially in challenging scenarios.
Contribution
The paper presents a novel multi-sensor 3D pose estimation network that fuses point clouds at the input level, improving robustness and accuracy in complex indoor scenes.
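As a rough illustration of what input-level fusion can mean, the sketch below transforms each sensor's point cloud into a shared world frame and concatenates the results before any network sees them. It is not the authors' implementation, which this summary does not describe. The names fuse_point_clouds and to_world, the 4x4 sensor-to-world extrinsic matrices, and the use of None to mark a failed sensor are all assumptions made for this example.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]
# Assumed convention: row-major 4x4 rigid transform mapping sensor to world.
Matrix4 = Sequence[Sequence[float]]


def to_world(points: Sequence[Point], extrinsic: Matrix4) -> List[Point]:
    """Apply a 4x4 sensor-to-world transform to every point."""
    out = []
    for x, y, z in points:
        out.append(tuple(
            extrinsic[r][0] * x + extrinsic[r][1] * y
            + extrinsic[r][2] * z + extrinsic[r][3]
            for r in range(3)
        ))
    return out


def fuse_point_clouds(clouds: Sequence[Optional[Sequence[Point]]],
                      extrinsics: Sequence[Matrix4]) -> List[Point]:
    """Input-level fusion: merge all available clouds into one world-frame cloud.

    A cloud of None stands for a failed sensor and is skipped (hypothetical
    convention, chosen only for this sketch).
    """
    fused = []
    for cloud, extrinsic in zip(clouds, extrinsics):
        if cloud is None:
            continue
        fused.extend(to_world(cloud, extrinsic))
    return fused


# Two toy sensors: one at the world origin, one shifted 2 units along x.
IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
SHIFT_X = [[1.0, 0.0, 0.0, 2.0],
           [0.0, 1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0, 0.0],
           [0.0, 0.0, 0.0, 1.0]]

cloud_a = [(0.0, 0.0, 1.0)]
cloud_b = [(1.0, 1.0, 1.0)]

fused = fuse_point_clouds([cloud_a, cloud_b], [IDENTITY, SHIFT_X])
fused_with_failure = fuse_point_clouds([cloud_a, None], [IDENTITY, SHIFT_X])
```

Because the merged cloud is a single unordered set of points, a downstream model built this way would receive the same kind of input whether one sensor or several contributed, which is one plausible reason an input-level design could tolerate individual camera failures.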
Findings
Outperforms state-of-the-art models in multi-sensor 3D pose estimation
Effective handling of sensor failures and cluttered scenes
Demonstrates robustness in real-world indoor scenarios
Abstract
Point cloud based methods have produced promising results in areas such as 3D object detection in autonomous driving. However, most of the recent point cloud work focuses on single depth sensor data, whereas less work has been done on indoor monitoring applications, such as operation room monitoring in hospitals or indoor surveillance. In these scenarios multiple cameras are often used to tackle occlusion problems. We propose an end-to-end multi-person 3D pose estimation network, Point R-CNN, using multiple point cloud sources. We conduct extensive experiments to simulate challenging real world cases, such as individual camera failures, various target appearances, and complex cluttered scenes with the CMU panoptic dataset and the MVOR operation room dataset. Unlike most of the previous methods that attempt to use multiple sensor information by building complex fusion models, which often…
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Neural Network Applications
