T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models

Chong Mou; Xintao Wang; Liangbin Xie; Yanze Wu; Jian Zhang; Zhongang Qi; Ying Shan; Xiaohu Qie

arXiv:2302.08453 · cs.CV · March 21, 2023 · 33 cites



TL;DR

This paper introduces lightweight T2I-Adapters that align with large-scale text-to-image models to enable more precise control over generated images, such as color and structure, without retraining the entire model.

Contribution

The paper proposes a novel method to explicitly control T2I models using lightweight adapters, enhancing controllability and editing capabilities while preserving the original model's knowledge.

Findings

01 Adapters enable granular control over color and structure.

02 The method maintains high generation quality.

03 Adapters are versatile and can be combined for complex editing.
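
To make the third finding concrete, here is a minimal sketch of one plausible way to combine adapters: a weighted element-wise sum of the features each adapter produces. The summary above only says that adapters "can be combined", so the weighted-sum rule, the function name `combine_adapter_features`, and the toy feature values are all assumptions made for illustration.

```python
# Hypothetical sketch: plain Python lists stand in for adapter feature maps.
# The weighted-sum rule is an assumption, not taken from the summary above.


def combine_adapter_features(features, weights):
    """Return the weighted element-wise sum of per-adapter feature vectors."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per adapter")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all adapter features must have the same length")
    return [
        sum(w * f[i] for f, w in zip(features, weights))
        for i in range(length)
    ]


# Toy features from two hypothetical adapters (e.g. structure and color).
structure_features = [1.0, 0.0, 2.0]
color_features = [0.0, 4.0, 2.0]

# Give the structure adapter full weight and the color adapter half weight.
combined = combine_adapter_features(
    [structure_features, color_features], [1.0, 0.5]
)
print(combined)
```

Under this assumption, the weights act as per-condition strength knobs: lowering a weight toward zero fades out that adapter's influence without touching the others.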

Abstract

Large-scale text-to-image (T2I) models have demonstrated a strong ability to learn complex structures and meaningful semantics. However, relying solely on text prompts cannot fully exploit the knowledge learned by the model, especially when flexible and accurate control (e.g., over color and structure) is needed. In this paper, we aim to "dig out" the capabilities that T2I models have implicitly learned, and then explicitly use them to control generation at a finer granularity. Specifically, we propose to learn simple and lightweight T2I-Adapters that align the internal knowledge of T2I models with external control signals, while keeping the original large T2I models frozen. In this way, we can train different adapters for different conditions, achieving rich control and editing effects over the color and structure of the generated results. Further,…
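
The core idea in the abstract, training a small adapter while the large model stays frozen, can be illustrated with a toy sketch. Scalars stand in for feature maps here, and every name (`frozen_encoder`, `ToyAdapter`, `train_adapter`) is hypothetical. Injecting the adapter output by simple addition is also an assumption of this sketch; the text above does not specify the mechanism.

```python
# Toy illustration only: scalars stand in for feature maps, and all names
# are hypothetical. This is not the authors' implementation.

FROZEN_WEIGHT = 2.0  # stands in for the pretrained T2I weights; never updated


def frozen_encoder(x):
    """Intermediate feature computed by the frozen base model."""
    return FROZEN_WEIGHT * x


class ToyAdapter:
    """Maps an external condition signal to a feature of matching shape."""

    def __init__(self):
        self.scale = 0.0  # the only trainable parameter in this sketch

    def __call__(self, condition):
        return self.scale * condition


def guided_feature(x, condition, adapter):
    # Assumption: the adapter feature is added to the frozen model's feature.
    return frozen_encoder(x) + adapter(condition)


def train_adapter(adapter, data, lr=0.1, epochs=200):
    """Gradient descent on squared error; only adapter.scale is updated."""
    for _ in range(epochs):
        for x, condition, target in data:
            error = guided_feature(x, condition, adapter) - target
            # d(error^2) / d(scale) = 2 * error * condition
            adapter.scale -= lr * 2.0 * error * condition
    return adapter


# Synthetic targets that need a condition contribution of 0.5 * condition.
data = [
    (x, c, 2.0 * x + 0.5 * c)
    for x in (0.0, 1.0, 2.0)
    for c in (-1.0, 1.0)
]
adapter = train_adapter(ToyAdapter(), data)
print(round(adapter.scale, 6), FROZEN_WEIGHT)
```

The point of the design is visible even at this scale: the base weight is read but never written, so whatever the base model already knows is preserved, and a new condition type only requires fitting a new small adapter.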


Code & Models

Videos

T2I-Adapter: Learning Adapters to Dig Out More Controllable Ability for Text-to-Image Diffusion Models

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: ALIGN