Socially and Contextually Aware Human Motion and Pose Forecasting
Vida Adeli, Ehsan Adeli, Ian Reid, Juan Carlos Niebles, Hamid Rezatofighi

TL;DR
This paper introduces a unified end-to-end framework for predicting human trajectories and body poses by integrating scene and social context information, improving accuracy over existing methods.
Contribution
It presents a novel joint modeling approach for human motion and pose forecasting that incorporates scene and social cues within a single pipeline.
Findings
Achieves superior performance on social datasets
Effectively models both global and local human movements
Outperforms several baseline methods
Abstract
Smooth and seamless robot navigation while interacting with humans depends on predicting human movements. Forecasting such human dynamics often involves modeling human trajectories (global motion) or detailed body joint movements (local motion). Prior work typically tackled local and global human movements separately. In this paper, we propose a novel framework to tackle both tasks of human motion (or trajectory) and body skeleton pose forecasting in a unified end-to-end pipeline. To deal with this real-world problem, we consider incorporating both scene and social contexts, as critical clues for this prediction task, into our proposed framework. To this end, we first couple these two tasks by i) encoding their history using a shared Gated Recurrent Unit (GRU) encoder and ii) applying a metric as loss, which measures the source of errors in each task jointly as a single distance. Then,…
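The abstract's idea of a loss that "measures the source of errors in each task jointly as a single distance" can be illustrated with a minimal sketch. The version below is an assumption, not the paper's definition: it treats the pose as 2D joint offsets relative to a root position, adds the trajectory (root) to obtain world-space joints, and averages the Euclidean distance to the ground truth. The names `global_joints` and `joint_distance_loss` are hypothetical.

```python
import math


def global_joints(root_xy, local_pose):
    """Place root-relative joints in world coordinates by adding the root position."""
    rx, ry = root_xy
    return [(rx + jx, ry + jy) for jx, jy in local_pose]


def joint_distance_loss(pred_traj, pred_pose, true_traj, true_pose):
    """Mean Euclidean distance between predicted and true world-space joints.

    Assumed layout (illustrative only):
      *_traj: list of (x, y) root positions, one per frame
      *_pose: list of frames, each a list of (x, y) offsets from the root
    Trajectory errors and pose errors both shift the world-space joints,
    so a single distance penalises the two tasks together.
    """
    total, count = 0.0, 0
    for p_root, p_pose, t_root, t_pose in zip(pred_traj, pred_pose, true_traj, true_pose):
        pred = global_joints(p_root, p_pose)
        true = global_joints(t_root, t_pose)
        for (px, py), (tx, ty) in zip(pred, true):
            total += math.hypot(px - tx, py - ty)
            count += 1
    return total / count


# One frame, two joints; ground truth has the root at the origin.
true_traj = [(0.0, 0.0)]
true_pose = [[(0.0, 1.0), (1.0, 0.0)]]

# Case A: pose is perfect, but the root is displaced by (3, 4).
loss_traj_error = joint_distance_loss([(3.0, 4.0)], true_pose, true_traj, true_pose)

# Case B: root is perfect, but the first joint is off by one unit.
loss_pose_error = joint_distance_loss(true_traj, [[(0.0, 2.0), (1.0, 0.0)]], true_traj, true_pose)
```

In this toy setup, a pure trajectory error moves every joint by the same amount, while a pure pose error moves only the affected joint, yet both are expressed in the same units and summed into one number.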
Taxonomy
Methods: Gated Recurrent Unit
