Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models

Shihao Zhao; Dongdong Chen; Yen-Chun Chen; Jianmin Bao; Shaozhe Hao; Lu Yuan; Kwan-Yee K. Wong

arXiv:2305.16322·cs.CV·October 31, 2023·65 cites

TL;DR

Uni-ControlNet is a versatile framework that enables simultaneous use of multiple local and global controls in text-to-image diffusion models, improving controllability and efficiency with minimal fine-tuning.

Contribution

It introduces a unified, composable control framework requiring only two adapters, reducing fine-tuning costs and enhancing control over image generation.

Findings

01 Outperforms existing methods in controllability and quality.

02 Requires only two adapters regardless of control types.

03 Demonstrates superior composability and efficiency.

Abstract

Text-to-Image diffusion models have made tremendous progress over the past two years, enabling the generation of highly realistic images based on open-domain text descriptions. However, despite their success, text descriptions often struggle to adequately convey detailed controls, even when composed of long and complex texts. Moreover, recent studies have also shown that these models face challenges in understanding such complex texts and generating the corresponding images. Therefore, there is a growing need to enable more control modes beyond text description. In this paper, we introduce Uni-ControlNet, a unified framework that allows for the simultaneous utilization of different local controls (e.g., edge maps, depth maps, segmentation masks) and global controls (e.g., CLIP image embeddings) in a flexible and composable manner within one single model. Unlike existing methods,…
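The abstract describes using several local controls and a global control together within one model. As a rough illustration of what "composable" could mean at the input level, the toy sketch below packs any subset of local controls into a fixed-layout input and appends global tokens to the text tokens. This is not the authors' implementation: the slot names, the zero-fill for absent controls, the token-append step, and every function name here are assumptions made for illustration only.

```python
# Toy sketch only. All names, shapes, and the zero-fill convention are
# illustrative assumptions, not code from the Uni-ControlNet repository.

LOCAL_SLOTS = ("edge", "depth", "segmentation")  # assumed fixed slot order


def zero_map(height, width):
    """Return a height x width map filled with zeros."""
    return [[0.0] * width for _ in range(height)]


def compose_local(conditions, height, width):
    """Stack one map per slot in a fixed order.

    Controls that are not supplied are replaced by zero maps, so the
    stacked input always has the same number of channels.
    """
    stacked = []
    for name in LOCAL_SLOTS:
        cond = conditions.get(name)
        stacked.append(cond if cond is not None else zero_map(height, width))
    return stacked  # layout: [len(LOCAL_SLOTS)][height][width]


def compose_context(text_tokens, global_tokens=None):
    """Append optional global-condition tokens after the text tokens."""
    return list(text_tokens) + list(global_tokens or [])


# Usage: only an edge map is supplied; the other two slots are zero-filled.
edge = [[1.0, 0.0], [0.0, 1.0]]
local_input = compose_local({"edge": edge}, height=2, width=2)
context = compose_context(["a", "cat"], ["g0", "g1"])
```

Because the layout is fixed, adding or dropping a control changes only the contents of the input, not its shape, which is one plausible way a single adapter could serve any combination of controls.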

Code & Models

Repositories

shihaozhaozsh/uni-controlnet (PyTorch, official)

Videos

Uni-ControlNet: All-in-One Control to Text-to-Image Diffusion Models · SlidesLive

Taxonomy

Topics: Mycobacterium research and diagnosis · Advanced Neuroimaging Techniques and Applications · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion · Contrastive Language-Image Pre-training · Adapter