Enhancing Hands in 3D Whole-Body Pose Estimation with Conditional Hands Modulator
Gyeongsik Moon

TL;DR
This paper introduces Hand4Whole++, a modular framework that combines pre-trained hand and whole-body pose estimators through a Conditional Hands Modulator (CHAM), improving both hand and full-body accuracy in 3D whole-body pose estimation.
Contribution
We propose CHAM, a lightweight module that modulates whole-body features with hand-specific information, enabling better hand and body pose predictions without retraining the entire model.
Findings
Significant improvement in hand pose accuracy.
Enhanced coherence of wrist orientations with upper-body kinematics.
Better overall full-body pose quality.
Abstract
Accurately recovering hand poses within the body context remains a major challenge in 3D whole-body pose estimation. This difficulty arises from a fundamental supervision gap: whole-body pose estimators are trained on full-body datasets with limited hand diversity, while hand-only estimators, trained on hand-centric datasets, excel at detailed finger articulation but lack global body awareness. To address this, we propose Hand4Whole++, a modular framework that leverages the strengths of both pre-trained whole-body and hand pose estimators. We introduce CHAM (Conditional Hands Modulator), a lightweight module that modulates the whole-body feature stream using hand-specific features extracted from a pre-trained hand pose estimator. This modulation enables the whole-body model to predict wrist orientations that are both accurate and coherent with the upper-body kinematic structure, without…
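The abstract describes CHAM as modulating the whole-body feature stream with hand-specific features, but the excerpt here does not specify the mechanism. The sketch below is one plausible reading, assuming a per-channel scale-and-shift modulation. The names `ModulatorSketch`, `project`, `w_scale`, and `w_shift` are hypothetical, and the zero initialisation (so that the pre-trained whole-body features pass through unchanged at the start) is an assumption, not a detail taken from the paper.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # shape: (hand_dim, body_dim)


def project(hand_feat: Vector, weight: Matrix) -> Vector:
    """Linear projection hand_feat @ weight: one value per body channel."""
    body_dim = len(weight[0])
    return [
        sum(h * row[j] for h, row in zip(hand_feat, weight))
        for j in range(body_dim)
    ]


class ModulatorSketch:
    """Hypothetical scale-and-shift modulator (assumed form, not the paper's CHAM)."""

    def __init__(self, hand_dim: int, body_dim: int) -> None:
        # Assumption: zero-initialised weights, so the module starts as the
        # identity and leaves the pre-trained whole-body features untouched.
        self.w_scale: Matrix = [[0.0] * body_dim for _ in range(hand_dim)]
        self.w_shift: Matrix = [[0.0] * body_dim for _ in range(hand_dim)]

    def __call__(self, body_feat: Vector, hand_feat: Vector) -> Vector:
        # Hand features decide how much each body channel is scaled and shifted.
        scale = project(hand_feat, self.w_scale)
        shift = project(hand_feat, self.w_shift)
        return [b * (1.0 + s) + t for b, s, t in zip(body_feat, scale, shift)]


# Usage: with zero-initialised weights the body features pass through unchanged.
modulator = ModulatorSketch(hand_dim=2, body_dim=3)
body_features = [1.0, 2.0, 3.0]
hand_features = [1.0, 1.0]
modulated = modulator(body_features, hand_features)
```

In a design like this, only the small modulator would need training while both pre-trained estimators stay frozen, which is consistent with the "lightweight" and "modular" framing in the abstract. Whether the actual method works this way is not stated in the text above.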
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Robot Manipulation and Learning
