Multi-Concept Customization of Text-to-Image Diffusion

Nupur Kumari; Bingliang Zhang; Richard Zhang; Eli Shechtman; Jun-Yan Zhu

arXiv:2212.04488 · cs.CV · June 21, 2023 · 23 cites

TL;DR

This paper introduces Custom Diffusion, a quick and efficient method to teach text-to-image models new concepts using minimal tuning, enabling the composition of multiple concepts in generated images.

Contribution

It presents a novel approach to rapidly customize text-to-image models for new concepts by optimizing only a few parameters, allowing for fast training and flexible concept composition.

Findings

01. Learns a new concept in approximately 6 minutes of fine-tuning.

02. Combines multiple new concepts in a single generated image.

03. Outperforms or matches existing methods in quality and efficiency.

Abstract

While generative models produce high-quality images of concepts learned from a large-scale database, a user often wishes to synthesize instantiations of their own concepts (for example, their family, pets, or items). Can we teach a model to quickly acquire a new concept, given a few examples? Furthermore, can we compose multiple new concepts together? We propose Custom Diffusion, an efficient method for augmenting existing text-to-image models. We find that only optimizing a few parameters in the text-to-image conditioning mechanism is sufficiently powerful to represent new concepts while enabling fast tuning (~6 minutes). Additionally, we can jointly train for multiple concepts or combine multiple fine-tuned models into one via closed-form constrained optimization. Our fine-tuned model generates variations of multiple new concepts and seamlessly composes them with existing concepts in…
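The idea of tuning only a few parameters in the conditioning mechanism can be illustrated with a minimal sketch. It assumes that the "few parameters" are the key and value projections of the cross-attention layers, which is where text features enter the image model; the abstract above does not spell this out. The parameter names and sizes in `TOY_PARAMS`, and the helpers `is_trainable` and `split_params`, are hypothetical and exist only for illustration.

```python
# Toy parameter table: name -> number of weights.
# Names and sizes are hypothetical; "attn1" stands for self-attention and
# "attn2" for cross-attention (an assumption of this sketch).
TOY_PARAMS = {
    "down.0.resnet.conv1.weight": 320 * 320 * 9,
    "down.0.attn1.to_q.weight": 320 * 320,
    "down.0.attn1.to_k.weight": 320 * 320,
    "down.0.attn1.to_v.weight": 320 * 320,
    "down.0.attn2.to_q.weight": 320 * 320,  # query comes from image features
    "down.0.attn2.to_k.weight": 320 * 768,  # key comes from text features
    "down.0.attn2.to_v.weight": 320 * 768,  # value comes from text features
}


def is_trainable(name):
    """Return True only for cross-attention key/value projections."""
    return "attn2" in name and (".to_k." in name or ".to_v." in name)


def split_params(params):
    """Split a name -> size table into (trainable, frozen) tables."""
    trainable = {n: c for n, c in params.items() if is_trainable(n)}
    frozen = {n: c for n, c in params.items() if n not in trainable}
    return trainable, frozen


trainable, frozen = split_params(TOY_PARAMS)
fraction = sum(trainable.values()) / sum(TOY_PARAMS.values())

print("trainable:", sorted(trainable))
print(f"trainable fraction of weights: {fraction:.3f}")
```

In a real training loop the same name filter would decide which tensors receive gradients, with everything else left frozen; the small trainable fraction is what makes short tuning times plausible.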

Taxonomy

Topics: Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion