DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization

Jisu Nam; Heesu Kim; DongJae Lee; Siyoon Jin; Seungryong Kim; Seunggyu Chang

arXiv:2402.09812·cs.CV·April 24, 2024·1 cites

TL;DR

DreamMatcher is a novel method for text-to-image personalization that uses appearance matching self-attention to improve semantic consistency and diversity in generated images, without disrupting the pre-trained model's structure.

Contribution

It introduces a semantic matching-based plug-in approach for T2I personalization that preserves model structure and enhances appearance accuracy.

Findings

1. Significant improvements in complex personalization scenarios.
2. Effective preservation of diverse image structures.
3. Enhanced semantic consistency in generated images.

Abstract

The objective of text-to-image (T2I) personalization is to customize a diffusion model to a user-provided reference concept, generating diverse images of the concept aligned with the target prompts. Conventional methods representing the reference concepts using unique text embeddings often fail to accurately mimic the appearance of the reference. To address this, one solution may be explicitly conditioning the reference images into the target denoising process, known as key-value replacement. However, prior works are constrained to local editing since they disrupt the structure path of the pre-trained T2I model. To overcome this, we propose a novel plug-in method, called DreamMatcher, which reformulates T2I personalization as semantic matching. Specifically, DreamMatcher replaces the target values with reference values aligned by semantic matching, while leaving the structure path…
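The value-replacement idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the flattened token layout, and the use of nearest-neighbor cosine matching to align reference tokens with target tokens are all assumptions made for illustration, since the abstract does not say how the semantic matching is computed. The sketch only shows the split the abstract describes: attention weights come from the target's own queries and keys (the structure path), while the values are reference values re-ordered by the matching.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def nearest_neighbor_match(feat_trg, feat_ref):
    """Hypothetical matcher: for each target token, return the index of the
    reference token with the highest cosine similarity.

    feat_trg: (N_trg, C) target features, feat_ref: (N_ref, C) reference features.
    """
    t = feat_trg / np.linalg.norm(feat_trg, axis=-1, keepdims=True)
    r = feat_ref / np.linalg.norm(feat_ref, axis=-1, keepdims=True)
    return (t @ r.T).argmax(axis=-1)


def matched_value_attention(q_trg, k_trg, v_ref, match):
    """Self-attention whose weights use only target queries and keys, while
    the values are reference values aligned to the target by `match`.

    q_trg, k_trg: (N_trg, D); v_ref: (N_ref, D_v); match: (N_trg,) int indices.
    """
    d = q_trg.shape[-1]
    # Structure path: attention map computed from the target alone.
    attn = softmax(q_trg @ k_trg.T / np.sqrt(d), axis=-1)
    # Appearance path: gather reference values so row i holds the value of
    # the reference token matched to target token i.
    v_aligned = v_ref[match]
    return attn @ v_aligned


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat_ref = rng.normal(size=(6, 8))
    feat_trg = feat_ref[::-1]  # target features are the reference, reversed
    match = nearest_neighbor_match(feat_trg, feat_ref)
    q, k = rng.normal(size=(6, 4)), rng.normal(size=(6, 4))
    v_ref = rng.normal(size=(6, 4))
    print(match, matched_value_attention(q, k, v_ref, match).shape)
```

Because the attention map never sees the reference keys, the layout of the generated image is still decided by the target denoising process. Only the content carried by the values changes.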

Code & Models

Repositories

KU-CVLAB/DreamMatcher
jax · Official

Taxonomy

Topics: Face recognition and analysis · Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion