Learning to Predict Diverse Human Motions from a Single Image via Mixture Density Networks
Chunzhi Gu, Yan Zhao, Chao Zhang

TL;DR
This paper introduces a novel method using mixture density networks to predict diverse future human motions from a single image, addressing the challenge of limited input information and stochastic motion uncertainty.
Contribution
The approach enables generation of multiple plausible human motions from a single image, incorporating energy-based loss functions for improved coherence and accuracy.
Findings
Effective in generating diverse motion hypotheses
Achieves high prediction accuracy on benchmark datasets
Outperforms existing methods in diversity and coherence
Abstract
Human motion prediction, which plays a key role in computer vision, generally requires a past motion sequence as input. However, in real applications, a complete and correct past motion sequence can be too expensive to obtain. In this paper, we propose a novel approach to predicting future human motions from a much weaker condition, i.e., a single image, with mixture density network (MDN) modeling. Unlike most existing deep human motion prediction approaches, the multimodal nature of MDN enables the generation of diverse future motion hypotheses, which compensates well for the strong stochastic ambiguity aggregated by the single input and human motion uncertainty. In designing the loss function, we further introduce an energy-based formulation to flexibly impose prior losses over the learnable parameters of MDN to maintain motion coherence as well as improve the prediction…
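To make the MDN idea in the abstract concrete, here is a minimal sketch of a generic mixture density output: the negative log-likelihood of a target under a Gaussian mixture, and sampling several hypotheses from it. It is not the authors' implementation. The isotropic covariance, the function names (`mdn_nll`, `sample_hypotheses`), and the toy dimensions are all assumptions made for illustration, and the paper's energy-based prior losses are not modeled here.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log of the mixture weights.
    return logits - np.logaddexp.reduce(logits)

def mdn_nll(pi_logits, mu, log_sigma, y):
    """NLL of target y under an isotropic Gaussian mixture (assumed form).

    pi_logits: (K,) unnormalized mixture weights
    mu:        (K, D) component means
    log_sigma: (K,) log standard deviation per component
    y:         (D,) target vector, e.g. a flattened future pose
    """
    K, D = mu.shape
    sq_dist = np.sum((y - mu) ** 2, axis=1)            # (K,)
    log_gauss = (-0.5 * sq_dist / np.exp(2.0 * log_sigma)
                 - D * log_sigma
                 - 0.5 * D * np.log(2.0 * np.pi))      # (K,)
    # log sum_k pi_k N(y | mu_k, sigma_k^2 I), via log-sum-exp
    return -np.logaddexp.reduce(log_softmax(pi_logits) + log_gauss)

def sample_hypotheses(pi_logits, mu, log_sigma, n, rng):
    """Draw n samples: pick a component, then add Gaussian noise."""
    pi = np.exp(log_softmax(pi_logits))
    ks = rng.choice(len(pi), size=n, p=pi)             # component indices
    eps = rng.standard_normal((n, mu.shape[1]))
    return mu[ks] + np.exp(log_sigma)[ks, None] * eps

# Toy usage: K=3 components over a D=4 dimensional target.
rng = np.random.default_rng(0)
pi_logits = np.zeros(3)
mu = rng.standard_normal((3, 4))
log_sigma = np.full(3, -1.0)
samples = sample_hypotheses(pi_logits, mu, log_sigma, n=5, rng=rng)
loss = mdn_nll(pi_logits, mu, log_sigma, y=mu[0])
```

Because each sample first selects a mixture component, repeated draws can land on different modes, which is the property that lets an MDN produce multiple distinct hypotheses from one input.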
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · Advanced Vision and Imaging
