DreamBlend: Advancing Personalized Fine-tuning of Text-to-Image Diffusion Models
Shwetha Ram, Tal Neiman, Qianli Feng, Andrew Stuart, Son Tran, Trishul Chilimbi

TL;DR
DreamBlend introduces a novel inference technique that combines early and late checkpoint features to enhance personalized text-to-image generation, balancing prompt fidelity, subject fidelity, and diversity.
Contribution
The paper proposes DreamBlend, an inference-time method that combines the prompt fidelity and diversity of earlier fine-tuning checkpoints with the subject fidelity of later checkpoints.
Findings
Outperforms state-of-the-art fine-tuning methods in personalized image generation.
Achieves better balance of prompt fidelity, subject fidelity, and diversity.
Enables high-quality image synthesis on challenging prompts.
Abstract
Given a small number of images of a subject, personalized image generation techniques can fine-tune large pre-trained text-to-image diffusion models to generate images of the subject in novel contexts, conditioned on text prompts. In doing so, a trade-off is made between prompt fidelity, subject fidelity and diversity. As the pre-trained model is fine-tuned, earlier checkpoints synthesize images with low subject fidelity but high prompt fidelity and diversity. In contrast, later checkpoints generate images with low prompt fidelity and diversity but high subject fidelity. This inherent trade-off limits the prompt fidelity, subject fidelity and diversity of generated images. In this work, we propose DreamBlend to combine the prompt fidelity from earlier checkpoints and the subject fidelity from later checkpoints during inference. We perform a cross attention guided image synthesis from a…
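The abstract is cut off before it describes the guidance mechanism, so the following is only a toy sketch of one way cross-attention guidance between two checkpoints could work, not the authors' algorithm. It assumes that guidance means blending the cross-attention map of an earlier checkpoint into the attention computation of a later checkpoint. All names (`cross_attention_map`, `blended_cross_attention`, `weight`) and the random tensors standing in for model activations are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_map(queries, keys):
    # queries: (n_pixels, dim) image features; keys: (n_tokens, dim) text features.
    # Each row is a distribution over prompt tokens for one spatial location.
    dim = queries.shape[-1]
    return softmax(queries @ keys.T / np.sqrt(dim), axis=-1)

def blended_cross_attention(q_late, k_late, v_late, attn_early, weight):
    # Assumption (not from the paper): guidance is a convex blend of the
    # early-checkpoint attention map with the late-checkpoint attention map.
    # weight=1.0 uses only the early layout; weight=0.0 is the plain late checkpoint.
    attn_late = cross_attention_map(q_late, k_late)
    attn = weight * attn_early + (1.0 - weight) * attn_late
    # Values still come from the late checkpoint, which carries subject detail.
    return attn @ v_late, attn

# Random stand-ins for the activations of the two checkpoints.
rng = np.random.default_rng(0)
n_pixels, n_tokens, dim = 16, 4, 8
q_early, k_early = rng.normal(size=(n_pixels, dim)), rng.normal(size=(n_tokens, dim))
q_late, k_late = rng.normal(size=(n_pixels, dim)), rng.normal(size=(n_tokens, dim))
v_late = rng.normal(size=(n_tokens, dim))

attn_early = cross_attention_map(q_early, k_early)
output, attn = blended_cross_attention(q_late, k_late, v_late, attn_early, weight=0.7)
```

Because both maps are row-wise probability distributions, their convex blend is one too, so the blended layer remains a valid attention layer. In this sketch, the early map supplies the prompt-driven layout and the late checkpoint's values supply the subject appearance.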
Taxonomy
Topics: Image Retrieval and Classification Techniques · Biomedical Text Mining and Ontologies · 3D Modeling in Geospatial Applications
Methods: Softmax · Attention Is All You Need · Diffusion
