HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

Guian Fang; Wenbiao Yan; Yuanfan Guo; Jianhua Han; Zutao Jiang; Hang Xu; Shengcai Liao; Xiaodan Liang

arXiv:2407.06937·cs.CV·July 10, 2024

TL;DR

HumanRefiner introduces a novel coarse-to-fine refinement method for improving the accuracy of human image generation in diffusion models, supported by a large-scale benchmark for detecting and correcting anatomical anomalies.

Contribution

The paper presents HumanRefiner, a new plug-and-play approach for refining human images in diffusion models, and introduces AbHuman, a large-scale benchmark for human anomaly detection.

Findings

1. Significantly reduces limb distortion in generated human images
2. Achieves 2.9x improvement over SDXL in limb quality
3. Outperforms DALL-E 3 in human evaluation metrics

Abstract

Text-to-image diffusion models have significantly advanced in conditional image generation. However, these models usually struggle with accurately rendering images featuring humans, resulting in distorted limbs and other anomalies. This issue primarily stems from the insufficient recognition and evaluation of limb qualities in diffusion models. To address this issue, we introduce AbHuman, the first large-scale synthesized human benchmark focusing on anatomical anomalies. This benchmark consists of 56K synthesized human images, each annotated with detailed, bounding-box level labels identifying 147K human anomalies in 18 different categories. Based on this, the recognition of human anomalies can be established, which in turn enhances image generation through traditional techniques such as negative prompting and guidance. To further boost the improvement, we propose HumanRefiner, a novel…
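
The negative prompting and guidance mentioned in the abstract are commonly implemented through classifier-free guidance, where the noise prediction for the negative prompt takes the place of the unconditional prediction. The block below is a minimal sketch of that standard combination step only, not the HumanRefiner method; the function name `guided_noise`, the plain-list representation of the noise predictions, and the scale value in the usage line are assumptions made for illustration.

```python
from typing import List


def guided_noise(
    eps_positive: List[float],
    eps_negative: List[float],
    guidance_scale: float,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    eps_positive: noise predicted for the desired prompt (hypothetical input).
    eps_negative: noise predicted for the negative prompt, e.g. a text
        describing distorted limbs (hypothetical input).
    guidance_scale: values above 1 push the result toward the positive
        prompt and away from the negative one.
    """
    if len(eps_positive) != len(eps_negative):
        raise ValueError("noise predictions must have the same length")
    # Start from the negative prediction and move along the direction
    # (positive - negative), scaled by the guidance weight.
    return [
        neg + guidance_scale * (pos - neg)
        for pos, neg in zip(eps_positive, eps_negative)
    ]


# Usage with toy two-element "noise" vectors.
combined = guided_noise([1.0, 2.0], [0.0, 1.0], guidance_scale=7.5)
```

With a scale of 1 the function returns the positive prediction unchanged, and with a scale of 0 it returns the negative prediction, which is a quick way to sanity-check the formula.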

Code & Models

Repositories

enderfga/humanrefiner (official)

Datasets

Enderfga/HumanRefiner (dataset, 34 downloads)

Taxonomy

Topics: Augmented Reality Applications · Interactive and Immersive Displays · Robot Manipulation and Learning

Methods: Diffusion