DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Nataniel Ruiz; Yuanzhen Li; Varun Jampani; Yael Pritch; Michael Rubinstein; Kfir Aberman

arXiv:2208.12242 · cs.CV · March 16, 2023 · 108 cites

TL;DR

DreamBooth introduces a fine-tuning method for text-to-image diffusion models that personalizes them to generate photorealistic images of specific subjects in diverse contexts using only a few reference images.

Contribution

The paper presents a novel fine-tuning approach with a class-specific prior preservation loss for subject-driven image generation, enabling high-quality personalization with minimal data.
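A minimal sketch of how a prior preservation loss of this kind could be combined with the usual fine-tuning objective. It assumes the total loss is a reconstruction term on the subject images plus a weighted reconstruction term on class-prior images; the function names, the plain mean-squared-error form, and the `prior_weight` parameter are illustrative assumptions, not the authors' implementation.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def prior_preservation_loss(
    subject_pred: Sequence[float],
    subject_target: Sequence[float],
    prior_pred: Sequence[float],
    prior_target: Sequence[float],
    prior_weight: float = 1.0,  # hypothetical weighting hyperparameter
) -> float:
    """Subject reconstruction term plus a weighted class-prior term.

    The prior term penalizes drift on generic images of the subject's
    class, so fine-tuning on a few subject images does not erase what
    the model already knows about that class.
    """
    subject_term = mse(subject_pred, subject_target)
    prior_term = mse(prior_pred, prior_target)
    return subject_term + prior_weight * prior_term


# Toy vectors standing in for model outputs and their targets.
loss = prior_preservation_loss(
    [1.0, 2.0], [1.0, 0.0], [0.0, 0.0], [1.0, 1.0], prior_weight=0.5
)
print(loss)
```

In a real training loop the vectors would be model predictions and targets for a batch that mixes subject images with class images, and the weight would be tuned to balance subject fidelity against preservation of the class prior.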

Findings

01 Effective subject recontextualization and view synthesis.

02 Preserves key features of subjects across diverse scenes.

03 Outperforms previous methods in personalized image generation.

Abstract

Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt. However, these models lack the ability to mimic the appearance of subjects in a given reference set and synthesize novel renditions of them in different contexts. In this work, we present a new approach for "personalization" of text-to-image diffusion models. Given as input just a few images of a subject, we fine-tune a pretrained text-to-image model such that it learns to bind a unique identifier with that specific subject. Once the subject is embedded in the output domain of the model, the unique identifier can be used to synthesize novel photorealistic images of the subject contextualized in different scenes. By leveraging the semantic prior embedded in the model with a new autogenous class-specific prior preservation loss, our…
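The identifier binding described in the abstract can be illustrated with a small prompt-building sketch. The template "a <identifier> <class noun>", the placeholder identifier `[V]`, and all function names here are assumptions made for illustration; the text above does not specify the exact prompt format.

```python
def subject_prompt(identifier: str, class_noun: str) -> str:
    """Prompt paired with the few reference images during fine-tuning."""
    return f"a {identifier} {class_noun}"


def class_prompt(class_noun: str) -> str:
    """Generic class prompt, usable for the class-prior examples."""
    return f"a {class_noun}"


def contextual_prompt(identifier: str, class_noun: str, context: str) -> str:
    """Prompt used after fine-tuning to place the subject in a new scene."""
    return f"{subject_prompt(identifier, class_noun)} {context}"


# "[V]" is a hypothetical placeholder for the unique identifier.
print(subject_prompt("[V]", "dog"))
print(class_prompt("dog"))
print(contextual_prompt("[V]", "dog", "on the beach"))
```

Keeping the class noun next to the identifier is one way to let the model reuse what it already knows about the class while the identifier carries the subject-specific appearance.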

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Image Processing and 3D Reconstruction

Methods: Diffusion