UnitedHuman: Harnessing Multi-Source Data for High-Resolution Human Generation

Jianglin Fu; Shikai Li; Yuming Jiang; Kwan-Yee Lin; Wayne Wu; Ziwei Liu

arXiv:2309.14335 · cs.CV · September 26, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

UnitedHuman is an end-to-end framework that trains on multi-source, multi-resolution datasets to improve high-resolution human image generation, targeting local regions such as faces and hands that existing methods struggle to synthesize.

Contribution

The paper proposes a novel Multi-Source Spatial Transformer and a continuous GAN framework to align and utilize diverse datasets for enhanced human image synthesis.

Findings

01 Generates higher-quality human images than methods trained on a holistic human dataset

02 Spatially aligns multi-source images with a human model

03 Demonstrates superior performance through extensive experiments

Abstract

Human generation has achieved significant progress. Nonetheless, existing methods still struggle to synthesize specific regions such as faces and hands. We argue that the main reason is rooted in the training data. A holistic human dataset inevitably has insufficient and low-resolution information on local parts. Therefore, we propose to use multi-source datasets with various resolution images to jointly learn a high-resolution human generative model. However, multi-source data inherently a) contains different parts that do not spatially align into a coherent human, and b) comes with different scales. To tackle these challenges, we propose an end-to-end framework, UnitedHuman, that empowers continuous GAN with the ability to effectively utilize multi-source data for high-resolution human generation. Specifically, 1) we design a Multi-Source Spatial Transformer that spatially aligns…

Code & Models

Repositories

unitedhuman/unitedhuman
PyTorch · Official

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Human Pose and Action Recognition

Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · CutMix · Label Smoothing · Dropout · ALIGN · Byte Pair Encoding · Absolute Position Encodings · Dense Connections