FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation
Kefan Chen, Chaerin Min, Linguang Zhang, Shreyas Hampali, Cem Keskin, Srinath Sridhar

TL;DR
FoundHand is a large-scale, domain-specific diffusion model that enables realistic, controllable hand image synthesis with precise pose, appearance, and viewpoint control, advancing the state of the art in hand image generation.
Contribution
The paper introduces FoundHand-10M, a large-scale hand dataset annotated with 2D keypoints and segmentation masks, and FoundHand, a diffusion model trained on it that uses 2D keypoints for controllable, realistic hand image synthesis with zero-shot correction capabilities.
Findings
State-of-the-art hand image synthesis performance
Effective pose and appearance control via 2D keypoints
Ability to fix malformed hands and generate hand videos
Abstract
Despite remarkable progress in image generation models, generating realistic hands remains a persistent challenge due to their complex articulation, varying viewpoints, and frequent occlusions. We present FoundHand, a large-scale domain-specific diffusion model for synthesizing single and dual hand images. To train our model, we introduce FoundHand-10M, a large-scale hand dataset with 2D keypoints and segmentation mask annotations. Our insight is to use 2D hand keypoints as a universal representation that encodes both hand articulation and camera viewpoint. FoundHand learns from image pairs to capture physically plausible hand articulations, natively enables precise control through 2D keypoints, and supports appearance control. Our model exhibits core capabilities that include the ability to repose hands, transfer hand appearance, and even synthesize novel views. This leads to zero-shot…
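The claim that 2D keypoints encode both hand articulation and camera viewpoint can be illustrated with a small projection sketch. This is not the paper's code: the pinhole camera, its parameters (`focal`, `depth`, `center`), the helper names, and the four-point toy "finger" are all assumptions made up for illustration. It only shows that changing the viewpoint or changing the articulation both change the 2D keypoints.

```python
import math

def rotate_y(point, angle_rad):
    """Rotate a 3D point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x + s * z, y, -s * x + c * z)

def project(points_3d, yaw_rad=0.0, depth=0.5, focal=500.0, center=(128.0, 128.0)):
    """Pinhole projection of 3D keypoints (metres) to 2D pixel coordinates.

    All camera parameters are hypothetical illustration values.
    """
    keypoints_2d = []
    for p in points_3d:
        x, y, z = rotate_y(p, yaw_rad)
        z += depth  # place the hand in front of the camera
        keypoints_2d.append((focal * x / z + center[0], focal * y / z + center[1]))
    return keypoints_2d

# Toy data (not from FoundHand-10M): a wrist point plus three finger joints.
straight_finger = [(0.0, 0.0, 0.0), (0.02, -0.03, 0.0), (0.02, -0.06, 0.0), (0.02, -0.09, 0.0)]
# Same finger with the last two joints curled toward the camera.
bent_finger = [(0.0, 0.0, 0.0), (0.02, -0.03, 0.0), (0.02, -0.05, -0.02), (0.02, -0.05, -0.04)]

front_view = project(straight_finger, yaw_rad=0.0)             # reference pose and view
side_view = project(straight_finger, yaw_rad=math.radians(60)) # same pose, new viewpoint
bent_view = project(bent_finger, yaw_rad=0.0)                  # new pose, same viewpoint

for name, kps in [("front", front_view), ("side", side_view), ("bent", bent_view)]:
    print(name, [(round(u, 1), round(v, 1)) for u, v in kps])
```

In this toy setup, the three printed keypoint sets differ even though only one factor changes at a time, which is the sense in which a single 2D keypoint input can carry both pose and viewpoint information.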
Taxonomy
Topics: Hand Gesture Recognition Systems · Human Pose and Action Recognition · Face recognition and analysis
Methods: Diffusion
