DanceHMR: Hand-Aware Whole-Body Human Mesh Recovery from Monocular Videos

Wenhao Shen; Ming Zhou; Hengyuan Zhang; Siyuan Bian; Youjiang Xu; Xi Lin

arXiv:2605.18102·cs.CV·May 22, 2026


TL;DR

DanceHMR is a framework for temporally stable whole-body human mesh recovery from monocular videos. It recovers detailed hand articulation together with body motion in challenging in-the-wild footage.

Contribution

DanceHMR introduces a unified, temporally coherent model that combines residual body-hand fusion with close-up-aware augmentation, improving hand recovery while keeping body motion stable, including under upper-body framing.
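Close-up-aware augmentation is not specified in detail on this page. As a rough illustration only, one plausible reading is to crop training samples to the upper body so the model sees framing where the legs are out of view. The function name `closeup_crop`, the `upper_ids` index set, and the `margin` parameter below are hypothetical and are not taken from the paper.

```python
import random


def closeup_crop(keypoints, upper_ids, margin=0.1, rng=None):
    """Sketch of an upper-body crop (hypothetical, not the authors' code).

    keypoints: list of (x, y) pixel coordinates for one person.
    upper_ids: indices treated as upper body (an assumed choice).
    Returns the crop box, keypoints in crop coordinates, and visibility flags.
    """
    rng = rng or random.Random()
    xs = [keypoints[i][0] for i in upper_ids]
    ys = [keypoints[i][1] for i in upper_ids]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)

    # Pad the tight box by a randomly scaled margin so crops vary per sample.
    pad = margin * max(x1 - x0, y1 - y0) * rng.uniform(0.5, 1.5)
    box = (x0 - pad, y0 - pad, x1 + pad, y1 + pad)

    # Shift keypoints into crop coordinates and flag those outside the crop.
    shifted = [(x - box[0], y - box[1]) for x, y in keypoints]
    visible = [
        box[0] <= x <= box[2] and box[1] <= y <= box[3] for x, y in keypoints
    ]
    return box, shifted, visible
```

In this sketch, keypoints that fall outside the crop are kept but marked invisible, so a loss could ignore them or supervise them differently; how the paper treats truncated joints is not stated here.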

Findings

01

Improved hand articulation recovery compared with prior methods.

02

Achieved stable, temporally consistent SMPL-X mesh motion in real-world videos.

03

Demonstrated competitive accuracy on whole-body and body-only benchmarks.
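Temporal stability in the findings above is commonly quantified through joint acceleration: a jittery sequence has large frame-to-frame second differences. The sketch below is a generic finite-difference measure, assuming joints are given as per-frame lists of 3D points. The function name `mean_accel` is illustrative, and this page does not say which stability metric the paper reports.

```python
def mean_accel(traj):
    """Mean magnitude of per-joint acceleration over a sequence.

    traj: list of frames; each frame is a list of (x, y, z) joint positions.
    Acceleration is the second finite difference p[t+1] - 2*p[t] + p[t-1].
    """
    total, count = 0.0, 0
    for t in range(1, len(traj) - 1):
        for prev, cur, nxt in zip(traj[t - 1], traj[t], traj[t + 1]):
            acc = [n - 2 * c + p for p, c, n in zip(prev, cur, nxt)]
            total += sum(a * a for a in acc) ** 0.5
            count += 1
    # Sequences shorter than three frames have no defined acceleration.
    return total / count if count else 0.0
```

A joint moving at constant velocity scores zero, while a joint that alternates between two positions every frame scores high, which matches the intuition that per-frame predictions tend to jitter.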

Abstract

Monocular video human mesh recovery is essential for digital humans, avatar animation, and embodied simulation, where both temporal stability and expressive whole-body motion are required. Existing video HMR methods produce coherent body motion but often overlook detailed hand articulation, while image-based whole-body methods recover SMPL-X meshes independently per frame, often leading to jittery and inaccurate hand motion. We present a temporally coherent whole-body HMR framework for challenging in-the-wild monocular videos. Our model unifies body context and part-specific hand observations through residual body-hand fusion, enabling stable body motion and detailed hand recovery within a single temporal architecture. We further introduce close-up-aware augmentation to improve robustness under upper-body framing. Experiments on whole-body and body-only benchmarks demonstrate improved…
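The abstract names residual body-hand fusion without giving its form. A minimal sketch of the general idea, assuming the hand branch contributes an additive correction to a body-context feature, might look like the following. The linear projection `weight`, the scalar `gate`, and the function name `residual_fuse` are assumptions made for illustration and do not describe the authors' architecture.

```python
def residual_fuse(body_feat, hand_feat, weight, gate=1.0):
    """Residual fusion sketch: fused = body_feat + gate * (weight @ hand_feat).

    body_feat: list of floats, length D (body-context feature).
    hand_feat: list of floats, length H (hand-crop feature).
    weight:    D rows of H floats, projecting hand features into body space.
    gate:      scalar controlling how much of the hand correction is applied.
    """
    # Project the hand feature into the body feature space.
    delta = [sum(w * h for w, h in zip(row, hand_feat)) for row in weight]
    # Add the projected correction on top of the unchanged body feature.
    return [b + gate * d for b, d in zip(body_feat, delta)]
```

With `gate=0.0` the output equals the body feature, so in this formulation the hand branch can only refine the body estimate rather than replace it. That is one common motivation for residual designs, though the paper's stated rationale is not available on this page.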
