TL;DR
This paper introduces a graph neural network-based method for multi-camera torso pose estimation in indoor environments, achieving accurate localization and orientation with low-resolution images, and compares different GNN architectures.
Contribution
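It presents an approach that uses graph neural networks to fuse data from multiple cameras at an early stage of the processing pipeline for indoor pose estimation, working from low-resolution RGB images rather than the relatively expensive RGBD setups that are frequently used.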
Findings
Achieved mean absolute error below 125 mm for location
Achieved orientation error below 10 degrees
Benchmarked two graph neural network implementations, plus a third model, in an apartment equipped with three cameras
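The two error figures above can be made concrete with a small sketch. It is not taken from the paper: the function names are hypothetical, and it assumes the location error is averaged per coordinate (the paper may instead use Euclidean distance) and that the orientation error is wrapped so that 350 degrees and 10 degrees count as 20 degrees apart.

```python
import numpy as np

def location_mae_mm(pred_xy, true_xy):
    """Mean absolute error over all coordinates, in the input units (mm here).

    Assumption: per-coordinate averaging; the paper's exact definition is not
    given in the abstract.
    """
    pred_xy = np.asarray(pred_xy, dtype=float)
    true_xy = np.asarray(true_xy, dtype=float)
    return float(np.mean(np.abs(pred_xy - true_xy)))

def orientation_mae_deg(pred_deg, true_deg):
    """Mean absolute angular error in degrees, with wrap-around handled."""
    diff = np.asarray(pred_deg, dtype=float) - np.asarray(true_deg, dtype=float)
    # Map every difference into [-180, 180) before taking the absolute value.
    wrapped = (diff + 180.0) % 360.0 - 180.0
    return float(np.mean(np.abs(wrapped)))

# Illustrative values only, not data from the paper.
loc_err = location_mae_mm([[0, 0], [100, 200]], [[50, 0], [100, 100]])
ang_err = orientation_mae_deg([350, 10], [10, 350])
```

Without the wrap-around step, the two orientation samples above would each be scored as 340 degrees off, which would badly overstate the error.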
Abstract
Estimating the location and orientation of humans is an essential skill for service and assistive robots. To achieve a reliable estimation in a wide area such as an apartment, multiple RGBD cameras are frequently used. Firstly, these setups are relatively expensive. Secondly, they seldom perform an effective data fusion using the multiple camera sources at an early stage of the processing pipeline. Occlusions and partial views make this second point very relevant in these scenarios. The proposal presented in this paper makes use of graph neural networks to merge the information acquired from multiple camera sources, achieving a mean absolute error below 125 mm for the location and 10 degrees for the orientation using low-resolution RGB images. The experiments, conducted in an apartment with three cameras, benchmarked two different graph neural network implementations and a third…
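As a rough illustration of what early fusion with a graph can look like, the sketch below treats each camera's observation as a node, runs one mean-aggregation message-passing step over a fully connected graph, and pools the result into a single pose vector. Everything here is an assumption made for illustration: the feature size, the fully connected topology, the random weights, and the (x, y, sin, cos) output encoding are hypothetical and are not the architecture described in the paper.

```python
import numpy as np

def build_adjacency(num_nodes):
    """Fully connected graph with self-loops, row-normalised for mean aggregation."""
    adj = np.ones((num_nodes, num_nodes))
    return adj / adj.sum(axis=1, keepdims=True)

def message_passing_layer(node_feats, adj_norm, weight):
    """One GCN-style step: average neighbour features, project, apply ReLU."""
    return np.maximum(adj_norm @ node_feats @ weight, 0.0)

def fuse_views(camera_feats, w_hidden, w_out):
    """Fuse per-camera feature vectors into one vector (x, y, sin, cos).

    The output encoding is an assumption of this sketch.
    """
    node_feats = np.stack(camera_feats)            # (num_cameras, feat_dim)
    adj_norm = build_adjacency(len(camera_feats))
    hidden = message_passing_layer(node_feats, adj_norm, w_hidden)
    pooled = hidden.mean(axis=0)                   # graph-level readout
    return pooled @ w_out

rng = np.random.default_rng(0)
feats = [rng.normal(size=8) for _ in range(3)]     # three cameras, as in the experiments
w_hidden = rng.normal(size=(8, 16))                # untrained, random weights
w_out = rng.normal(size=(16, 4))
pose = fuse_views(feats, w_hidden, w_out)
```

Because the readout averages over nodes, the same function accepts any number of views, so a camera whose view is occluded could simply be left out of `camera_feats` in this sketch.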
Taxonomy
Methods: Graph Neural Network
