T2I-Adapter: Learning Adapters to Dig out More Controllable Ability for Text-to-Image Diffusion Models
Chong Mou, Xintao Wang, Liangbin Xie, Yanze Wu, Jian Zhang, Zhongang Qi, Ying Shan, Xiaohu Qie

TL;DR
This paper introduces lightweight T2I-Adapters that align external control signals with the internal knowledge of large-scale text-to-image models, enabling more precise control over generated images (e.g., color and structure) without retraining the entire model.
Contribution
The paper proposes a novel method to explicitly control T2I models using lightweight adapters, enhancing controllability and editing capabilities while preserving the original model's knowledge.
Findings
Adapters enable granular control over color and structure.
The method maintains high generation quality.
Adapters are versatile and can be combined for complex editing.
Abstract
The incredible generative ability of large-scale text-to-image (T2I) models has demonstrated strong power of learning complex structures and meaningful semantics. However, relying solely on text prompts cannot fully take advantage of the knowledge learned by the model, especially when flexible and accurate controlling (e.g., color and structure) is needed. In this paper, we aim to "dig out" the capabilities that T2I models have implicitly learned, and then explicitly use them to control the generation more granularly. Specifically, we propose to learn simple and lightweight T2I-Adapters to align internal knowledge in T2I models with external control signals, while freezing the original large T2I models. In this way, we can train various adapters according to different conditions, achieving rich control and editing effects in the color and structure of the generation results. Further,…
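The core idea in the abstract, training a small adapter while the large model stays frozen, can be illustrated with a toy sketch. This is not the authors' implementation: `frozen_model`, `ToyAdapter`, `train_adapter`, the constant `FROZEN_WEIGHT`, and the target coefficient `3.0` are all hypothetical stand-ins, and the sketch assumes the adapter's output is simply added to a feature computed by the frozen model.

```python
import random

# Stand-in for the pretrained T2I model's weights (hypothetical); never updated.
FROZEN_WEIGHT = 2.0


def frozen_model(x, guidance=0.0):
    """Toy 'pretrained model': a fixed feature plus optional adapter guidance."""
    feature = FROZEN_WEIGHT * x
    return feature + guidance


class ToyAdapter:
    """Toy adapter with a single trainable parameter (hypothetical)."""

    def __init__(self):
        # Toy choice: start at zero so the frozen model's output is unchanged.
        self.scale = 0.0

    def __call__(self, condition):
        return self.scale * condition


def train_adapter(adapter, steps=200, lr=0.1, seed=0):
    """Fit the adapter with SGD on squared error; only adapter.scale is updated."""
    rng = random.Random(seed)
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        condition = rng.uniform(-1.0, 1.0)
        # Hypothetical target: the frozen output shifted by 3.0 * condition.
        target = FROZEN_WEIGHT * x + 3.0 * condition
        pred = frozen_model(x, guidance=adapter(condition))
        error = pred - target
        # Gradient of error**2 with respect to adapter.scale.
        adapter.scale -= lr * 2.0 * error * condition
    return adapter


adapter = train_adapter(ToyAdapter())
print(f"learned adapter scale: {adapter.scale:.4f}")
```

In this toy setting the adapter learns how the control signal should shift the output, while `FROZEN_WEIGHT` keeps its original value. Training a separate adapter per condition type would follow the same pattern.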
Code & Models
- 🤗 TencentARC/T2I-Adapter (model) · ♡ 845
- 🤗 TencentARC/t2iadapter_canny_sd14v1 (model) · 16 dl · ♡ 1
- 🤗 TencentARC/t2iadapter_canny_sd15v2 (model) · 1.9k dl · ♡ 3
- 🤗 TencentARC/t2iadapter_color_sd14v1 (model) · 154 dl · ♡ 12
- 🤗 TencentARC/t2iadapter_depth_sd14v1 (model) · 3 dl
- 🤗 TencentARC/t2iadapter_depth_sd15v2 (model) · 1.8k dl · ♡ 3
- 🤗 TencentARC/t2iadapter_openpose_sd14v1 (model) · 12 dl · ♡ 1
- 🤗 TencentARC/t2iadapter_seg_sd14v1 (model) · 12 dl
- 🤗 TencentARC/t2iadapter_sketch_sd14v1 (model) · 11 dl · ♡ 1
- 🤗 TencentARC/t2iadapter_sketch_sd15v2 (model) · 1.9k dl · ♡ 6
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
Methods: ALIGN
