Diffusion-based 3D Hand Motion Recovery with Intuitive Physics

Yufei Zhang; Zijun Cui; Jeffrey O. Kephart; Qiang Ji

arXiv:2508.01835·cs.CV·August 5, 2025

TL;DR

This paper introduces a diffusion-based framework for 3D hand motion recovery from videos that incorporates physics knowledge, improving accuracy and temporal coherence without needing annotated video data.

Contribution

A physics-augmented diffusion model, trained only on motion capture data without images, refines the 3D hand motion sequences produced by existing image-based reconstruction methods.

Findings

01. Achieves state-of-the-art performance on benchmarks.

02. Significantly improves temporal coherence in hand motion sequences.

03. Effectively integrates physics constraints into the diffusion process.

Abstract

While 3D hand reconstruction from monocular images has made significant progress, generating accurate and temporally coherent motion estimates from videos remains challenging, particularly during hand-object interactions. In this paper, we present a novel 3D hand motion recovery framework that enhances image-based reconstructions through a diffusion-based and physics-augmented motion refinement model. Our model captures the distribution of refined motion estimates conditioned on initial ones, generating improved sequences through an iterative denoising process. Instead of relying on scarce annotated video data, we train our model only using motion capture data without images. We identify valuable intuitive physics knowledge during hand-object interactions, including key motion states and their associated motion constraints. We effectively integrate these physical insights into our…
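
To make the idea of "iterative denoising conditioned on initial estimates" concrete, here is a minimal sketch of standard DDPM-style ancestral sampling in NumPy. It is not the authors' implementation: the names `refine_motion`, `toy_denoiser`, and `smooth` are hypothetical, the noise schedule is an assumed linear one, and the learned network is replaced by a toy stand-in that simply treats a moving average of the initial estimate as the clean motion. The physics constraints described in the abstract are not modeled here.

```python
import numpy as np

def smooth(seq, window=5):
    # Moving average along time (axis 0) with edge padding; seq has shape (T, D).
    pad = window // 2
    padded = np.pad(seq, ((pad, pad), (0, 0)), mode="edge")
    kernel = np.ones(window) / window
    cols = [np.convolve(padded[:, d], kernel, mode="valid") for d in range(seq.shape[1])]
    return np.stack(cols, axis=1)

def toy_denoiser(x_t, t, cond, alpha_bar):
    # ASSUMPTION: stand-in for a trained noise-prediction network.
    # It pretends the clean sequence is a smoothed copy of the initial estimate
    # and returns the noise implied by x_t = sqrt(abar) * x0 + sqrt(1 - abar) * eps.
    target = smooth(cond)
    return (x_t - np.sqrt(alpha_bar[t]) * target) / np.sqrt(1.0 - alpha_bar[t])

def refine_motion(initial, denoiser, num_steps=50, seed=0):
    # Reverse diffusion: start from Gaussian noise and denoise step by step,
    # conditioning every step on the initial (image-based) motion estimate.
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)  # assumed linear schedule
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(initial.shape)
    for t in reversed(range(num_steps)):
        eps = denoiser(x, t, initial, alpha_bar)
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0  # no noise on the last step
        x = mean + np.sqrt(betas[t]) * noise
    return x

# Toy data: a smooth 3-channel trajectory over 60 frames plus per-frame jitter,
# playing the role of a noisy image-based reconstruction.
frames = np.linspace(0.0, 2.0 * np.pi, 60)
clean = np.stack([np.sin(frames), np.cos(frames), 0.5 * frames], axis=1)
initial = clean + 0.05 * np.random.default_rng(1).standard_normal(clean.shape)
refined = refine_motion(initial, toy_denoiser)
```

In a real system the denoiser would be a trained network that takes the noisy sequence, the timestep, and the initial estimate as inputs; the sampling loop itself would stay structurally the same.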

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Robot Manipulation and Learning