A generic diffusion-based approach for 3D human pose prediction in the wild
Saeed Saadatnejad, Ali Rasekh, Mohammadreza Mofayezi, Yasamin Medghalchi, Sara Rajabzadeh, Taylor Mordan, Alexandre Alahi

TL;DR
This paper introduces a diffusion-based method for 3D human pose prediction that effectively handles noisy inputs and occlusions, outperforming existing methods on multiple datasets and enhancing other models through pre- and post-processing.
Contribution
The paper presents a novel diffusion-based framework for 3D human pose prediction that manages noisy data and occlusions, with a temporal cascaded model for long-term forecasting.
Findings
Outperforms state-of-the-art on four datasets
Handles noisy inputs and occlusions effectively
Improves existing models as a pre- or post-processing step
Abstract
Predicting 3D human poses in real-world scenarios, also known as human pose forecasting, is inevitably subject to noisy inputs arising from inaccurate 3D pose estimations and occlusions. To address these challenges, we propose a diffusion-based approach that can predict given noisy observations. We frame the prediction task as a denoising problem, where both observation and prediction are considered as a single sequence containing missing elements (whether in the observation or prediction horizon). All missing elements are treated as noise and denoised with our conditional diffusion model. To better handle long-term forecasting horizon, we present a temporal cascaded diffusion model. We demonstrate the benefits of our approach on four publicly available datasets (Human3.6M, HumanEva-I, AMASS, and 3DPW), outperforming the state-of-the-art. Additionally, we show that our framework is…
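The "single sequence with missing elements" framing in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `build_masked_sequence`, the array layout (frames, joints, 3), and the choice of standard Gaussian noise for missing entries are all assumptions made for illustration. It only shows how observed frames, occluded joints, and the future horizon might be placed in one array together with a mask telling a denoiser which entries are known.

```python
import numpy as np

def build_masked_sequence(observation, pred_len, occluded=None, seed=0):
    """Hypothetical helper: merge observation and prediction horizon into one sequence.

    observation: float array of shape (T_obs, J, 3) with 3D joint positions.
    pred_len:    number of future frames to predict.
    occluded:    optional bool array (T_obs, J), True where a joint is missing.
    Returns (sequence, known), where `known` marks entries copied from the input.
    """
    rng = np.random.default_rng(seed)
    t_obs, n_joints, _ = observation.shape
    total = t_obs + pred_len

    # Known entries: observed frames, minus any occluded joints.
    known = np.zeros((total, n_joints), dtype=bool)
    known[:t_obs] = True
    if occluded is not None:
        known[:t_obs] &= ~occluded

    # Pad the observation to the full length so it aligns with the mask.
    padded = np.zeros((total, n_joints, 3))
    padded[:t_obs] = observation

    # Assumption: every missing element (occluded or future) starts as Gaussian noise.
    noise = rng.standard_normal((total, n_joints, 3))
    sequence = np.where(known[..., None], padded, noise)
    return sequence, known
```

A conditional denoiser would then receive `sequence` and `known` and be asked to fill in only the entries where `known` is False, which covers occluded joints in the past and all joints in the future with the same mechanism.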
Taxonomy
Topics: Human Pose and Action Recognition · 3D Shape Modeling and Analysis · Human Motion and Animation
Methods: Repair · Diffusion
