# PHD: Personalized 3D Human Body Fitting with Point Diffusion

**Authors:** Hsuan-I Ho, Chen Guo, Po-Chen Wu, Ivan Shugurov, Chengcheng Tang, Abhay Mittal, Sizhe An, Manuel Kaufmann, Linguang Zhang

arXiv: 2508.21257 · 2025-09-01

## TL;DR

PHD is a personalized 3D human body fitting method for video: it first calibrates the user's body shape, then fits poses under a shape-conditioned point diffusion prior. The prior is trained only on synthetic data and works as a plug-and-play module on top of existing 3D pose estimators.

## Contribution

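The paper presents a body shape-conditioned 3D pose prior, implemented as a Point Diffusion Transformer, that guides pose fitting through a Point Distillation Sampling loss. By conditioning on user-specific body shape, the prior improves pose estimation accuracy and reduces errors caused by over-reliance on 2D constraints.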

## Key findings

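- Improves both pelvis-aligned and absolute pose accuracy; the authors note that the latter is often overlooked by prior work.
- Data-efficient: training requires only synthetic data.
- Acts as a plug-and-play module that can be integrated with existing 3D pose estimators to improve their performance.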
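
The difference between pelvis-aligned and absolute pose accuracy can be illustrated with a small sketch. It assumes the common mean per-joint position error (MPJPE) formulation, with joint index 0 as the pelvis; the function names, the toy skeleton, and the pelvis index are illustrative assumptions, and the paper's exact evaluation protocol is not given in this summary.

```python
import math

def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding joints (same units as input)."""
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(gt)

def center_on_pelvis(joints, pelvis_idx=0):
    """Translate joints so the pelvis sits at the origin (pelvis_idx is an assumption)."""
    px, py, pz = joints[pelvis_idx]
    return [(x - px, y - py, z - pz) for x, y, z in joints]

def pelvis_aligned_mpjpe(pred, gt, pelvis_idx=0):
    """MPJPE after removing the global translation of both skeletons."""
    return mpjpe(center_on_pelvis(pred, pelvis_idx), center_on_pelvis(gt, pelvis_idx))

# Hypothetical 3-joint skeleton in metres, roughly 3 m from the camera.
gt = [(0.0, 0.0, 3.0), (0.0, 0.5, 3.0), (0.2, 0.5, 3.0)]
# Prediction with a correct articulated pose but a 10 cm global depth offset.
pred = [(x, y, z + 0.1) for x, y, z in gt]

absolute_error = mpjpe(pred, gt)               # penalizes the depth offset
aligned_error = pelvis_aligned_mpjpe(pred, gt)  # blind to the depth offset
```

In this toy case the pelvis-aligned error vanishes while the absolute error stays at roughly 10 cm, which is why a method can look accurate under the aligned metric and still misplace the body in space.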

## Abstract

We introduce PHD, a novel approach for personalized 3D human mesh recovery (HMR) and body fitting that leverages user-specific shape information to improve pose estimation accuracy from videos. Traditional HMR methods are designed to be user-agnostic and optimized for generalization. While these methods often refine poses using constraints derived from the 2D image to improve alignment, this process compromises 3D accuracy by failing to jointly account for person-specific body shapes and the plausibility of 3D poses. In contrast, our pipeline decouples this process by first calibrating the user's body shape and then employing a personalized pose fitting process conditioned on that shape. To achieve this, we develop a body shape-conditioned 3D pose prior, implemented as a Point Diffusion Transformer, which iteratively guides the pose fitting via a Point Distillation Sampling loss. This learned 3D pose prior effectively mitigates errors arising from an over-reliance on 2D constraints. Consequently, our approach improves not only pelvis-aligned pose accuracy but also absolute pose accuracy -- an important metric often overlooked by prior work. Furthermore, our method is highly data-efficient, requiring only synthetic data for training, and serves as a versatile plug-and-play module that can be seamlessly integrated with existing 3D pose estimators to enhance their performance. Project page: https://phd-pose.github.io/
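
To give a rough intuition for how a diffusion prior can iteratively guide a fit, the following sketch applies a generic score-distillation-style update to a single 3D point. It is not the paper's Point Distillation Sampling loss, whose exact form is not stated in this summary: a closed-form Gaussian prior stands in for the Point Diffusion Transformer, there is no shape conditioning and no 2D data term, and every name and constant (`toy_eps_prediction`, `distillation_step`, the noise range, the step size) is a hypothetical choice.

```python
import math
import random

def toy_eps_prediction(x_t, alpha, prior_mean, prior_var):
    """Exact noise prediction for an isotropic Gaussian prior.

    Stand-in for a learned denoiser; a real system would call a network here.
    """
    scale = math.sqrt(1.0 - alpha) / (alpha * prior_var + 1.0 - alpha)
    return [scale * (v - math.sqrt(alpha) * m) for v, m in zip(x_t, prior_mean)]

def distillation_step(x, prior_mean, prior_var, rng, lr=0.05):
    """One score-distillation-style update: noise x, denoise, step along (eps_hat - eps)."""
    alpha = rng.uniform(0.2, 0.9)            # hypothetical noise-level range
    eps = [rng.gauss(0.0, 1.0) for _ in x]   # injected Gaussian noise
    x_t = [math.sqrt(alpha) * v + math.sqrt(1.0 - alpha) * e for v, e in zip(x, eps)]
    eps_hat = toy_eps_prediction(x_t, alpha, prior_mean, prior_var)
    # The residual between predicted and injected noise acts as the gradient.
    return [v - lr * (eh - e) for v, eh, e in zip(x, eps_hat, eps)]

rng = random.Random(0)
prior_mean = [0.0, 0.9, 0.0]   # hypothetical "plausible" joint location
joint = [0.6, 0.1, 0.5]        # implausible initial estimate
start_dist = math.dist(joint, prior_mean)
for _ in range(200):
    joint = distillation_step(joint, prior_mean, 0.01, rng)
end_dist = math.dist(joint, prior_mean)
```

Under these assumptions the point drifts toward the high-density region of the prior. In a full fitting pipeline such a prior term would be combined with image-based constraints rather than used alone.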

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2508.21257/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/2508.21257/full.md

## References

81 references — full list in the complete paper: https://tomesphere.com/paper/2508.21257/full.md

---
Source: https://tomesphere.com/paper/2508.21257