FaceSnap: Enhanced ID-fidelity Network for Tuning-free Portrait Customization
Benxiang Zhai, Yifang Xu, Guofeng Zhang, Yang Li, and Sidan Du

TL;DR
FaceSnap is a novel portrait customization method based on Stable Diffusion that requires only one reference image, ensuring high fidelity and identity preservation without fine-tuning, and is easily extendable to different models.
Contribution
We introduce FaceSnap, a plug-and-play, tuning-free portrait generation framework with a Facial Attribute Mixer, Landmark Predictor, and ID-preserving module for superior personalized image synthesis.
Findings
Outperforms state-of-the-art methods in personalized portrait generation.
Requires only a single reference image for high-fidelity results.
Maintains identity and facial details across diverse poses.
Abstract
Benefiting from the significant advancements in text-to-image diffusion models, research in personalized image generation, particularly customized portrait generation, has also made great strides recently. However, existing methods either require time-consuming fine-tuning and lack generalizability or fail to achieve high fidelity in facial details. To address these issues, we propose FaceSnap, a novel method based on Stable Diffusion (SD) that requires only a single reference image and produces extremely consistent results in a single inference stage. This method is plug-and-play and can be easily extended to different SD models. Specifically, we design a new Facial Attribute Mixer that can extract comprehensive fused information from both low-level specific features and high-level abstract features, providing better guidance for image generation. We also introduce a Landmark Predictor…
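The abstract states that the Facial Attribute Mixer fuses low-level specific features with high-level abstract features, but it does not say how. As a rough illustration of what such a fusion could look like, the sketch below uses generic scaled dot-product cross-attention: each high-level token queries the low-level tokens and the attended summary is added back residually. This is an assumption for illustration only, not the paper's architecture; the function name `fuse_features` and the toy inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_features(high_level, low_level):
    """Hypothetical fusion (NOT the paper's mixer): every high-level token
    attends over all low-level tokens, then the attended vector is added
    to the original token as a residual."""
    dim = len(high_level[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in high_level:
        # Attention weights of this high-level token over low-level tokens.
        weights = softmax([dot(query, key) * scale for key in low_level])
        # Weighted average of the low-level tokens.
        attended = [
            sum(w * token[i] for w, token in zip(weights, low_level))
            for i in range(dim)
        ]
        # Residual connection keeps the high-level information intact.
        fused.append([q + a for q, a in zip(query, attended)])
    return fused


# Toy inputs: two "abstract" tokens and three "detail" tokens, both 2-D.
high = [[1.0, 0.0], [0.0, 1.0]]
low = [[0.5, 0.5], [1.0, -1.0], [-1.0, 1.0]]
fused = fuse_features(high, low)
```

The output keeps one fused vector per high-level token, so in a design like this it could be passed on wherever the high-level tokens were used before. A real implementation would use learned projections and multiple heads, which are omitted here.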
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Multimodal Machine Learning Applications
