PIDiff: Image Customization for Personalized Identities with Diffusion Models

Jinyu Gu; Haipeng Liu; Meng Wang; Yang Wang

arXiv:2505.05081 · cs.CV · May 13, 2025

TL;DR

PIDiff is a diffusion model fine-tuned with a tailored training strategy to generate images of a personalized identity. It disentangles identity from background and supports style editing.

Contribution

The paper introduces PIDiff, a fine-tuned diffusion model that uses the StyleGAN W+ space and a new training strategy to improve identity localization and disentanglement in personalized image generation.

Findings

01 PIDiff outperforms previous methods in identity preservation.

02 It achieves accurate identity localization in in-the-wild images.

03 It enables effective style editing of personalized identities.

Abstract

Text-to-image generation for personalized identities aims to incorporate a specific identity into images using a text prompt and an identity image. Building on the powerful generative capabilities of DDPMs, many previous works adopt additional prompts, such as text embeddings and CLIP image embeddings, to represent the identity information, but they fail to disentangle identity information from background information. As a result, the generated images not only lose key identity characteristics but also suffer from significantly reduced diversity. To address this issue, previous works have combined the W+ space from StyleGAN with diffusion models, leveraging this space to provide a more accurate and comprehensive representation of identity features through multi-level feature extraction. However, the entanglement of identity and background information in in-the-wild images during…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Face recognition and analysis

Methods: R1 Regularization · Adaptive Instance Normalization · Convolution · Dense Connections · Feedforward Network · StyleGAN · ADaptive gradient method with the OPTimal convergence rate · Diffusion · Contrastive Language-Image Pre-training