Fusion is all you need: Face Fusion for Customized Identity-Preserving Image Synthesis
Salaheldin Mohamed, Dong Han, Yong Li

TL;DR
This paper introduces a novel face fusion technique using the UNet architecture from Stable Diffusion, enabling high-quality, identity-preserving image synthesis that surpasses previous methods in accuracy and robustness.
Contribution
We propose a new face fusion method that directly incorporates reference images into the UNet's cross-attention layers, improving identity preservation and multi-reference generation in text-to-image models.
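One way to picture fusing reference images inside a cross-attention layer is sketched below. It is an illustrative assumption, not the authors' implementation: it assumes reference-image feature tokens are simply appended to the text tokens so that the latent queries attend over both. The names `attention` and `fused_cross_attention`, the concatenation strategy, and the omission of learned query/key/value projections are all hypothetical simplifications.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain Python lists (no learned projections)."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim_out = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, then normalized weights.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out.append([sum(w * v[d] for w, v in zip(weights, values))
                    for d in range(dim_out)])
    return out


def fused_cross_attention(latent_tokens, text_tokens, reference_tokens):
    """Hypothetical fusion: latents attend over text AND reference tokens.

    Assumption: fusion is done by concatenating the two token sets into one
    key/value context. Tokens from several reference images can be
    concatenated into `reference_tokens` for multi-reference generation.
    """
    context = text_tokens + reference_tokens
    return attention(latent_tokens, context, context)


# Toy usage with 2-dimensional tokens.
latent = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
text = [[1.0, 0.0], [0.0, 1.0]]
reference = [[2.0, 2.0]]
fused = fused_cross_attention(latent, text, reference)
```

In this sketch, passing an empty `reference_tokens` list reduces the layer to ordinary text cross-attention, which makes the role of the reference tokens easy to isolate.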
Findings
Achieves state-of-the-art identity-similarity metrics.
Enhances robustness and consistency across diverse reference images.
Facilitates efficient multi-identity image synthesis.
Abstract
Text-to-image (T2I) models have significantly advanced the development of artificial intelligence, enabling the generation of high-quality images in diverse contexts based on specific text prompts. However, existing T2I-based methods often struggle to accurately reproduce the appearance of individuals from a reference image and to create novel representations of those individuals in various settings. To address this, we leverage the pre-trained UNet from Stable Diffusion to incorporate the target face image directly into the generation process. Our approach diverges from prior methods that depend on fixed encoders or static face embeddings, which often fail to bridge encoding gaps. Instead, we capitalize on UNet's sophisticated encoding capabilities to process reference images across multiple scales. By innovatively altering the cross-attention layers of the UNet, we effectively fuse…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Face recognition and analysis · Image Retrieval and Classification Techniques
Methods: Diffusion
