Legs Over Arms: On the Predictive Value of Lower-Body Pose for Human Trajectory Prediction from Egocentric Robot Perception
Nhat Le, Daeun Song, Xuesu Xiao

TL;DR
This study shows that using lower-body 3D skeletal keypoints as additional input reduces Average Displacement Error by 13% in human trajectory prediction from egocentric robot perception, and that the gain persists with 2D keypoints extracted from monocular panoramic images, which suggests such sensing can inform social navigation.
Contribution
The paper systematically evaluates skeletal features for trajectory prediction, revealing the predictive power of lower-body 3D keypoints and biomechanical cues, and highlights the effectiveness of monocular panoramic vision.
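As a rough illustration of what "lower-body 3D keypoints" and a derived "biomechanical cue" could look like as model inputs, the sketch below keeps only hip, knee, and ankle joints from a skeleton and computes a knee angle from them. The joint names, the dictionary layout, and the choice of knee angle as the cue are assumptions made for this example; this summary does not state the paper's keypoint format or which cues it uses.

```python
import math

# Hypothetical joint names; the paper's actual skeleton layout is not given here.
LOWER_BODY = ("left_hip", "left_knee", "left_ankle",
              "right_hip", "right_knee", "right_ankle")

def select_lower_body(skeleton):
    """Keep only lower-body joints from a {name: (x, y, z)} skeleton."""
    return {name: skeleton[name] for name in LOWER_BODY if name in skeleton}

def joint_angle(a, b, c):
    """Angle at joint b (radians) between segments b->a and b->c."""
    u = [ai - bi for ai, bi in zip(a, b)]
    v = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm = math.sqrt(sum(ui * ui for ui in u)) * math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0:
        return 0.0  # degenerate segment; no meaningful angle
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def knee_angles(skeleton):
    """Illustrative biomechanical cue (an assumption): left and right knee angles."""
    legs = select_lower_body(skeleton)
    return {
        side: joint_angle(legs[f"{side}_hip"], legs[f"{side}_knee"], legs[f"{side}_ankle"])
        for side in ("left", "right")
    }
```

A predictor could then receive the selected joints and the derived angles alongside each person's past positions. How the paper actually encodes and fuses these features is not described in this summary.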
Findings
Focusing on lower-body 3D keypoints reduces Average Displacement Error (ADE) by 13%.
Augmenting 3D keypoint inputs with the corresponding biomechanical cues yields a further 1-4% improvement.
2D keypoints from panoramic images still provide useful motion cues.
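For readers unfamiliar with the metric behind these percentages: Average Displacement Error is commonly defined as the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon. The minimal sketch below computes it for a single 2D trajectory, together with the relative reduction between two error values. The function names are illustrative, and the paper's exact evaluation protocol (horizon, averaging over agents) is not given in this summary.

```python
import math

def average_displacement_error(pred, truth):
    """Mean Euclidean distance between predicted and ground-truth positions."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(pred)

def relative_reduction(baseline, improved):
    """Fractional error reduction of `improved` relative to `baseline`."""
    if baseline == 0:
        raise ValueError("baseline error must be non-zero")
    return (baseline - improved) / baseline

# Toy values for illustration only; these are not results from the paper.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 0.3), (2.0, 0.6)]
toy_ade = average_displacement_error(pred, truth)
```

Under this definition, a 13% reduction means the improved model's ADE is 0.87 times the baseline's ADE.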
Abstract
Predicting human trajectories is crucial for social robot navigation in crowded environments. While most existing approaches treat humans as point masses, we present a study on multi-agent trajectory prediction that leverages different human skeletal features for improved forecast accuracy. In particular, we systematically evaluate the predictive utility of 2D and 3D skeletal keypoints and derived biomechanical cues as additional inputs. Through a comprehensive study on the JRDB dataset and a new dataset for social navigation with 360-degree panoramic videos, we find that focusing on lower-body 3D keypoints yields a 13% reduction in Average Displacement Error, and that augmenting 3D keypoint inputs with corresponding biomechanical cues provides a further 1-4% improvement. Notably, the performance gain persists when using 2D keypoint inputs extracted from equirectangular panoramic images,…
Taxonomy
Topics: Human Pose and Action Recognition · Social Robot Interaction and HRI · Human Motion and Animation
