Diff-PC: Identity-preserving and 3D-aware Controllable Diffusion for Zero-shot Portrait Customization
Yifang Xu, Benxiang Zhai, Chenyu Zhang, Ming Li, Yang Li, Sidan Du

TL;DR
Diff-PC is a diffusion-based framework enabling zero-shot portrait customization with high identity preservation, precise facial control, and diverse backgrounds, leveraging 3D face priors and novel encoding techniques.
Contribution
We introduce Diff-PC, a novel diffusion-based approach that integrates 3D face priors and specialized encoders for improved identity fidelity and controllability in portrait editing.
Findings
Outperforms state-of-the-art in identity preservation and facial control.
Achieves high T2I alignment and background diversity.
Compatible with multi-style foundation models.
Abstract
Portrait customization (PC) has recently garnered significant attention due to its potential applications. However, existing PC methods lack precise identity (ID) preservation and face control. To address these tissues, we propose Diff-PC, a diffusion-based framework for zero-shot PC, which generates realistic portraits with high ID fidelity, specified facial attributes, and diverse backgrounds. Specifically, our approach employs the 3D face predictor to reconstruct the 3D-aware facial priors encompassing the reference ID, target expressions, and poses. To capture fine-grained face details, we design ID-Encoder that fuses local and global facial features. Subsequently, we devise ID-Ctrl using the 3D face to guide the alignment of ID features. We further introduce ID-Injector to enhance ID fidelity and facial controllability. Finally, training on our collected ID-centric dataset improves…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsFace recognition and analysis · Generative Adversarial Networks and Image Synthesis · Face Recognition and Perception
