Human Pose Forecasting via Deep Markov Models
Sam Toyer, Anoop Cherian, Tengda Han, Stephen Gould

TL;DR
This paper introduces a deep generative model based on Deep Markov Models for long-range human pose forecasting, evaluated with a new large-scale dataset and a pose-based action classifier to better assess forecast quality.
Contribution
The paper presents a variational autoencoder framework built on Deep Markov Models (DMMs) for human pose prediction, together with a new dataset (Ikea FA) and a classifier-based evaluation method for long-term forecasts.
Findings
Promising results on Ikea FA and NTU RGB+D datasets.
Better assessment of pose forecast quality with a pose-based classifier.
Effective long-range pose forecasting demonstrated.
Abstract
Human pose forecasting is an important problem in computer vision with applications to human-robot interaction, visual surveillance, and autonomous driving. Usually, forecasting algorithms use 3D skeleton sequences and are trained to forecast for a few milliseconds into the future. Long-range forecasting is challenging due to the difficulty of estimating how long a person continues an activity. To this end, our contributions are threefold: (i) we propose a generative framework for poses using variational autoencoders based on Deep Markov Models (DMMs); (ii) we evaluate our pose forecasts using a pose-based action classifier, which we argue better reflects the subjective quality of pose forecasts than distance in coordinate space; (iii) last, for evaluation of the new model, we introduce a 480,000-frame video dataset called Ikea Furniture Assembly (Ikea FA), which depicts humans…
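As a rough illustration of the idea in the abstract, the sketch below rolls a Deep Markov Model forward in time: a latent state evolves through a learned stochastic transition, and each latent state is decoded into a pose. It also includes the kind of coordinate-space distance that the abstract contrasts with classifier-based evaluation. This is a minimal sketch and not the authors' implementation: the network sizes, the untrained random weights, the 2D joints, and all function names (`init_mlp`, `mlp`, `dmm_rollout`, `mean_joint_distance`) are assumptions made for illustration.

```python
import numpy as np

# Hypothetical sizes chosen for illustration; the abstract does not give them.
N_JOINTS, COORDS, LATENT_DIM, HIDDEN = 8, 2, 4, 16


def init_mlp(rng, n_in, n_hidden, n_out):
    """Random (untrained) weights for a one-hidden-layer network."""
    return (rng.normal(0.0, 0.1, (n_in, n_hidden)), np.zeros(n_hidden),
            rng.normal(0.0, 0.1, (n_hidden, n_out)), np.zeros(n_out))


def mlp(params, x):
    """tanh hidden layer followed by a linear output layer."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


def dmm_rollout(rng, z0, transition, emission, steps):
    """Sample a pose sequence from a DMM-style generative model.

    z_t ~ Normal(mean(z_{t-1}), std(z_{t-1})),  pose_t = emission(z_t).
    """
    z, poses = z0, []
    for _ in range(steps):
        out = mlp(transition, z)
        mean, log_std = out[:LATENT_DIM], out[LATENT_DIM:]
        # Reparameterised Gaussian sample of the next latent state.
        z = mean + np.exp(log_std) * rng.normal(size=LATENT_DIM)
        poses.append(mlp(emission, z))
    return np.stack(poses)  # shape: (steps, N_JOINTS * COORDS)


def mean_joint_distance(pred, target):
    """Coordinate-space error: mean Euclidean distance per joint per frame."""
    p = pred.reshape(len(pred), N_JOINTS, COORDS)
    t = target.reshape(len(target), N_JOINTS, COORDS)
    return float(np.linalg.norm(p - t, axis=-1).mean())


rng = np.random.default_rng(0)
transition = init_mlp(rng, LATENT_DIM, HIDDEN, 2 * LATENT_DIM)
emission = init_mlp(rng, LATENT_DIM, HIDDEN, N_JOINTS * COORDS)
forecast = dmm_rollout(rng, np.zeros(LATENT_DIM), transition, emission, steps=30)
```

Because each transition is sampled, repeated rollouts from the same starting state yield different plausible futures. A forecast can score poorly on `mean_joint_distance` while still depicting a recognisable action, which is the motivation the abstract gives for evaluating forecasts with a pose-based action classifier instead.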
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · 3D Shape Modeling and Analysis
