HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion

Xian Liu; Jian Ren; Aliaksandr Siarohin; Ivan Skorokhodov; Yanyu Li; Dahua Lin; Xihui Liu; Ziwei Liu; Sergey Tulyakov

arXiv:2310.08579 · cs.CV · March 18, 2024 · 6 cites

TL;DR

HyperHuman introduces a unified diffusion-based framework that leverages large-scale human-centric data and structural modeling to generate hyper-realistic, diverse human images with coherent poses and detailed geometry.

Contribution

The paper presents a novel Latent Structural Diffusion Model and a large-scale HumanVerse dataset for improved human image synthesis.

Findings

01  Achieves state-of-the-art realism in human image generation

02  Effectively models structural and appearance correlations

03  Produces diverse human images with coherent poses

Abstract

Despite significant advances in large-scale text-to-image models, achieving hyper-realistic human image generation remains a desirable yet unsolved task. Existing models like Stable Diffusion and DALL-E 2 tend to generate human images with incoherent parts or unnatural poses. To tackle these challenges, our key insight is that the human image is inherently structural over multiple granularities, from the coarse-level body skeleton to the fine-grained spatial geometry. Therefore, capturing such correlations between the explicit appearance and the latent structure in one model is essential to generating coherent and natural human images. To this end, we propose a unified framework, HyperHuman, that generates in-the-wild human images of high realism and diverse layouts. Specifically, 1) we first build a large-scale human-centric dataset, named HumanVerse, which consists of 340M images with comprehensive…

Videos

HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion · SlidesLive

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Diffusion