HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance
Guian Fang, Wenbiao Yan, Yuanfan Guo, Jianhua Han, Zutao Jiang, Hang Xu, Shengcai Liao, Xiaodan Liang

TL;DR
HumanRefiner introduces a novel coarse-to-fine refinement method for improving the accuracy of human image generation in diffusion models, supported by a large-scale benchmark for detecting and correcting anatomical anomalies.
Contribution
The paper presents HumanRefiner, a new plug-and-play approach for refining human images in diffusion models, and introduces AbHuman, a large-scale benchmark for human anomaly detection.
Findings
Significantly reduces limb distortion in generated human images
Achieves 2.9x improvement over SDXL in limb quality
Outperforms DALL-E 3 in human evaluation metrics
Abstract
Text-to-image diffusion models have significantly advanced in conditional image generation. However, these models usually struggle with accurately rendering images featuring humans, resulting in distorted limbs and other anomalies. This issue primarily stems from the insufficient recognition and evaluation of limb qualities in diffusion models. To address this issue, we introduce AbHuman, the first large-scale synthesized human benchmark focusing on anatomical anomalies. This benchmark consists of 56K synthesized human images, each annotated with detailed, bounding-box level labels identifying 147K human anomalies in 18 different categories. Based on this, the recognition of human anomalies can be established, which in turn enhances image generation through traditional techniques such as negative prompting and guidance. To further boost the improvement, we propose HumanRefiner, a novel…
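The abstract names negative prompting and guidance as traditional ways to put anomaly recognition to use. As a rough illustration only, and not the paper's HumanRefiner method, the sketch below shows how a negative prompt is commonly folded into classifier-free guidance: the noise prediction for the negative prompt takes the place of the unconditional prediction, and the result is pushed away from it toward the positive-prompt prediction. The function name `guided_noise` and the toy vectors are assumptions made for this example; a real sampler would operate on tensors produced by a denoising network.

```python
from typing import List, Sequence


def guided_noise(
    eps_positive: Sequence[float],
    eps_negative: Sequence[float],
    guidance_scale: float,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    eps_positive: prediction conditioned on the desired prompt.
    eps_negative: prediction conditioned on the negative prompt
                  (hypothetical example: "distorted limbs, extra fingers").
    guidance_scale: how strongly to move away from the negative prediction.
    """
    if len(eps_positive) != len(eps_negative):
        raise ValueError("predictions must have the same length")
    # eps = eps_neg + s * (eps_pos - eps_neg)
    return [n + guidance_scale * (p - n) for p, n in zip(eps_positive, eps_negative)]


# Toy values standing in for per-pixel noise predictions (illustrative only).
eps_pos = [1.0, 2.0]
eps_neg = [0.0, 1.0]
combined = guided_noise(eps_pos, eps_neg, guidance_scale=2.0)
```

With a scale of 1 the result equals the positive-prompt prediction, and larger scales extrapolate further away from the negative-prompt prediction.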
Taxonomy
Topics: Augmented Reality Applications · Interactive and Immersive Displays · Robot Manipulation and Learning
Methods: Diffusion
