HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances
Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen, Ishita Dasgupta, Saayan Mitra, Minh Hoai

TL;DR
HanDiffuser is a novel diffusion-based model that significantly improves the realism of generated hands in text-to-image synthesis by incorporating detailed hand representations and a two-stage generative process.
Contribution
The paper introduces HanDiffuser, a new architecture that explicitly models hand parameters to enhance hand realism in generated images, addressing a key challenge in text-to-image models.
Findings
Achieves high-quality, realistic hand generation in images.
Outperforms existing models in quantitative and qualitative evaluations.
User studies confirm improved hand realism.
Abstract
Text-to-image generative models can generate high-quality humans, but realism is lost when generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers of fingers, and physically implausible finger orientations. To generate images with realistic hands, we propose a novel diffusion-based architecture called HanDiffuser that achieves realism by injecting hand embeddings in the generative process. HanDiffuser consists of two components: a Text-to-Hand-Params diffusion model to generate SMPL-Body and MANO-Hand parameters from input text prompts, and a Text-Guided Hand-Params-to-Image diffusion model to synthesize images by conditioning on the prompts and hand parameters generated by the previous component. We incorporate multiple aspects of hand representation, including 3D shapes and joint-level finger positions, orientations and articulations, for robust…
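The two-component design described in the abstract can be pictured as a simple data flow: the prompt goes into a first stage that produces body and hand parameters, and a second stage consumes both the prompt and those parameters to produce an image. The sketch below illustrates only that control flow. Every name in it (HandParams, text_to_hand_params, hand_params_to_image, generate), the vector sizes, and the toy "denoising" loop are assumptions made for illustration; none of it is the authors' code, and no learned model or real SMPL/MANO library is involved.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

# Illustrative vector sizes only; the real parameterization is defined by the paper.
BODY_DIM = 72
HAND_DIM = 48


@dataclass
class HandParams:
    """Hypothetical container for the stage-1 output."""
    body: List[float]  # stand-in for SMPL-Body parameters
    hand: List[float]  # stand-in for MANO-Hand parameters


def _toy_denoise(x: List[float], steps: int) -> List[float]:
    # Placeholder for a learned reverse-diffusion loop:
    # each step just halves the values; no network is evaluated.
    for _ in range(steps):
        x = [0.5 * v for v in x]
    return x


def text_to_hand_params(prompt: str, steps: int = 10, seed: int = 0) -> HandParams:
    """Stage 1 (placeholder): prompt -> body and hand parameters."""
    rng = random.Random(f"{seed}:{prompt}")
    body = _toy_denoise([rng.gauss(0.0, 1.0) for _ in range(BODY_DIM)], steps)
    hand = _toy_denoise([rng.gauss(0.0, 1.0) for _ in range(HAND_DIM)], steps)
    return HandParams(body=body, hand=hand)


def hand_params_to_image(prompt: str, params: HandParams, size: int = 8,
                         steps: int = 10, seed: int = 0) -> List[List[float]]:
    """Stage 2 (placeholder): (prompt, hand parameters) -> image grid."""
    # The seed depends on both inputs, so the output is conditioned on both.
    rng = random.Random(f"{seed}:{prompt}:{sum(params.hand):.6f}")
    flat = _toy_denoise([rng.gauss(0.0, 1.0) for _ in range(size * size)], steps)
    return [flat[r * size:(r + 1) * size] for r in range(size)]


def generate(prompt: str, seed: int = 0) -> Tuple[HandParams, List[List[float]]]:
    """Chain the two stages: stage-1 output conditions stage 2."""
    params = text_to_hand_params(prompt, seed=seed)
    image = hand_params_to_image(prompt, params, seed=seed)
    return params, image
```

Calling generate("a person waving") returns the intermediate parameters together with the image grid, which makes the intermediate hand representation inspectable before the image stage runs. That separation is the point of the sketch; the actual conditioning mechanism is whatever the paper specifies.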
Taxonomy
Topics: Human Motion and Animation · Handwritten Text Recognition Techniques · Video Analysis and Summarization
Methods: Diffusion
