3D Pose Estimation and Future Motion Prediction from 2D Images
Ji Yang, Youdong Ma, Xinxin Zuo, Sen Wang, Minglun Gong, Li Cheng

TL;DR
This paper introduces PoseMoNet, a joint framework for 3D human pose estimation and future motion prediction from 2D images, utilizing Lie algebra representations and a multi-task sequence-to-sequence architecture to improve accuracy.
Contribution
The paper proposes a novel self-projection mechanism and a global refinement module within a multi-task encoder-decoder framework for joint 3D pose estimation and motion prediction.
Findings
Achieves competitive results on Human3.6M and HumanEva-I benchmarks.
Demonstrates the effectiveness of the self-projection mechanism.
Shows that modeling pose estimation and motion prediction jointly improves performance over treating the two tasks separately.
Abstract
This paper jointly tackles the highly correlated tasks of estimating 3D human body poses and predicting future 3D motions from RGB image sequences. Based on a Lie algebra pose representation, a novel self-projection mechanism is proposed that naturally preserves human motion kinematics. This is further facilitated by a sequence-to-sequence multi-task architecture based on an encoder-decoder topology, which enables us to tap into the common ground shared by both tasks. Finally, a global refinement module is proposed to boost the performance of our framework. The effectiveness of our approach, called PoseMoNet, is demonstrated by ablation tests and empirical evaluations on the Human3.6M and HumanEva-I benchmarks, where it achieves competitive performance compared with the state of the art.
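To illustrate why a Lie algebra pose representation can preserve kinematics, the following is a minimal sketch, not the authors' self-projection mechanism. It assumes each joint rotation is stored as an axis-angle vector in so(3), maps it to a rotation matrix with Rodrigues' formula, and runs forward kinematics along a simple chain. The function names (`so3_exp`, `forward_kinematics`), the three-bone chain, and the bone lengths are hypothetical choices made for this example only.

```python
import numpy as np

def so3_exp(omega):
    """Map an axis-angle vector in so(3) to a 3x3 rotation matrix (Rodrigues' formula)."""
    omega = np.asarray(omega, dtype=float)
    theta = np.linalg.norm(omega)
    if theta < 1e-12:
        # Near-zero rotation: return the identity to avoid dividing by zero.
        return np.eye(3)
    k = omega / theta
    # Skew-symmetric (cross-product) matrix of the unit rotation axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def forward_kinematics(omegas, bone_lengths):
    """Joint positions of a simple kinematic chain rooted at the origin.

    Each bone is laid along the local x-axis of its parent frame
    (an assumption made for this illustration).
    """
    rotation = np.eye(3)
    position = np.zeros(3)
    joints = [position.copy()]
    for omega, length in zip(omegas, bone_lengths):
        rotation = rotation @ so3_exp(omega)  # accumulate rotations down the chain
        position = position + rotation @ np.array([length, 0.0, 0.0])
        joints.append(position.copy())
    return np.stack(joints)

# Hypothetical three-bone chain with arbitrary rotation parameters.
rng = np.random.default_rng(0)
bone_lengths = np.array([0.45, 0.40, 0.20])
omegas = rng.normal(size=(3, 3))

joints = forward_kinematics(omegas, bone_lengths)
recovered = np.linalg.norm(np.diff(joints, axis=0), axis=1)
```

Because every rotation produced by `so3_exp` is orthogonal, the values in `recovered` match `bone_lengths` for any choice of `omegas`. A model that predicts such parameters, rather than raw joint coordinates, therefore cannot output poses with stretched or shrunken bones.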
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Diabetic Foot Ulcer Assessment and Management
