JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation

Yu Zeng; Vishal M. Patel; Haochen Wang; Xun Huang; Ting-Chun Wang; Ming-Yu Liu; Yogesh Balaji

arXiv:2407.06187 · cs.CV · July 9, 2024

Open Access

TL;DR

JeDi introduces a finetuning-free method for personalized text-to-image generation by learning joint distributions of related image-text pairs, enabling high-quality, identity-preserving image synthesis using reference images.

Contribution

The paper presents a novel joint-image diffusion approach that eliminates the need for finetuning, improving personalization quality and efficiency in text-to-image models.

Findings

01 Achieves state-of-the-art generation quality.

02 Outperforms prior finetuning-based and finetuning-free methods.

03 Enables fast personalization with reference images.

Abstract

Personalized text-to-image generation models enable users to create images that depict their individual possessions in diverse scenes, finding applications in various domains. To achieve the personalization capability, existing methods rely on finetuning a text-to-image foundation model on a user's custom dataset, which can be non-trivial for general users, resource-intensive, and time-consuming. Despite attempts to develop finetuning-free methods, their generation quality is much lower than that of their finetuning-based counterparts. In this paper, we propose Joint-Image Diffusion (JeDi), an effective technique for learning a finetuning-free personalization model. Our key idea is to learn the joint distribution of multiple related text-image pairs that share a common subject. To facilitate learning, we propose a scalable synthetic dataset generation technique. Once trained, our model…

Taxonomy

Topics: Image Retrieval and Classification Techniques · AI in cancer detection · Video Analysis and Summarization

Methods: Diffusion