DRDM: A Disentangled Representations Diffusion Model for Synthesizing Realistic Person Images
Enbo Huang, Yuan Zhang, Faliang Huang, Guangyu Zhang, Yang Liu

TL;DR
This paper introduces DRDM, a diffusion model that generates realistic person images with controllable poses and appearances by disentangling features and guiding the synthesis process, improving detail preservation and reducing distortions.
Contribution
The paper proposes a novel disentangled representations diffusion model with a body-part decoupling block and a parsing map-based guided sampling method for improved person image synthesis.
Findings
Achieves high-quality pose transfer and appearance control.
Reduces limb distortion and garment style deviation.
Demonstrates effectiveness on the DeepFashion dataset.
Abstract
Person image synthesis with controllable body poses and appearances is an essential task owing to practical needs in virtual try-on, image editing and video production. However, existing methods face significant challenges with missing details, limb distortion and garment style deviation. To address these issues, we propose a Disentangled Representations Diffusion Model (DRDM) to generate photo-realistic images from source portraits in specific desired poses and appearances. First, a pose encoder is responsible for encoding pose features into a high-dimensional space to guide the generation of person images. Second, a body-part subspace decoupling block (BSDB) disentangles features from the different body parts of a source figure and feeds them to the various layers of the noise prediction block, thereby supplying the network with rich disentangled features for…
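As a rough illustration of what "disentangling features from the different body parts" can mean in practice, the sketch below pools a feature map separately for each label in a human parsing map. This is not the paper's BSDB: the function name `pool_features_by_part`, the masked average pooling, and the toy data are all assumptions chosen for illustration, written in plain Python so the idea is easy to follow.

```python
from typing import Dict, List


def pool_features_by_part(
    features: List[List[List[float]]],  # H x W x C feature map (hypothetical)
    parsing_map: List[List[int]],       # H x W body-part labels (hypothetical)
) -> Dict[int, List[float]]:
    """Average the feature vectors of all pixels that share a part label."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for feat_row, label_row in zip(features, parsing_map):
        for vec, label in zip(feat_row, label_row):
            if label not in sums:
                sums[label] = [0.0] * len(vec)
                counts[label] = 0
            # Accumulate channel-wise sums for this body part.
            sums[label] = [s + v for s, v in zip(sums[label], vec)]
            counts[label] += 1
    # Divide by the pixel count to get one mean vector per part.
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


# Toy 2x2 image with three assumed part labels and 2-channel features.
parsing_map = [[0, 1], [1, 2]]
features = [[[1.0, 0.0], [2.0, 4.0]], [[4.0, 0.0], [0.0, 8.0]]]
part_features = pool_features_by_part(features, parsing_map)
```

Each entry of `part_features` is one vector per body part, which a conditioning network could then consume separately. How DRDM actually computes and injects its disentangled features is not specified in the truncated abstract above.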
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction · Digital Media Forensic Detection
Methods: Diffusion
