TruePose: Human-Parsing-guided Attention Diffusion for Full-ID Preserving Pose Transfer
Zhihong Xu, Dongxia Wang, Peng Du, Yang Cao, Qing Guo

TL;DR
TruePose introduces a novel human-parsing-guided attention diffusion method that significantly improves the preservation of clothing and facial details in pose transfer, addressing limitations of existing diffusion models.
Contribution
The paper proposes a human-parsing-aware Siamese network with fusion and CLIP-guided attention modules to enhance clothing and facial detail preservation in pose-guided image synthesis.
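As a rough illustration of what parsing-guided attention can mean in practice, the sketch below restricts scaled dot-product attention to source positions that a human-parsing map marks as belonging to a region of interest (for example, clothing). This is not the paper's implementation: the function name `parsing_masked_attention`, the boolean `key_mask`, and the toy shapes are all assumptions made for illustration, and the Siamese network, fusion module, and CLIP guidance are not modeled here.

```python
import numpy as np

def parsing_masked_attention(q, k, v, key_mask):
    """Scaled dot-product attention restricted to keys where key_mask is True.

    q: (n_q, d) queries (e.g., target-pose features)      -- hypothetical
    k: (n_k, d) keys, v: (n_k, d_v) values (source image) -- hypothetical
    key_mask: (n_k,) boolean, True for source positions inside the parsed region
    """
    if not key_mask.any():
        raise ValueError("key_mask must select at least one key")
    scores = q @ k.T / np.sqrt(q.shape[-1])                # (n_q, n_k)
    # Positions outside the parsed region get -inf so softmax gives them weight 0.
    scores = np.where(key_mask[None, :], scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy example: 4 target queries attend over 6 source positions,
# of which 3 are (hypothetically) labeled as clothing by a parser.
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(6, 8))
v = rng.normal(size=(6, 8))
clothing_mask = np.array([True, True, False, False, True, False])
out, weights = parsing_masked_attention(q, k, v, clothing_mask)
```

Each row of `weights` sums to 1 and is zero outside the masked region, so every output feature is built only from source positions inside the parsed region.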
Findings
Outperforms 13 baseline methods in clothing and facial detail preservation
Effective on both in-shop and in-the-wild datasets
Achieves high-quality pose transfer with identity preservation
Abstract
Pose-Guided Person Image Synthesis (PGPIS) generates images that maintain a subject's identity from a source image while adopting a specified target pose (e.g., skeleton). While diffusion-based PGPIS methods effectively preserve facial features during pose transformation, they often struggle to accurately maintain clothing details from the source image throughout the diffusion process. This limitation becomes particularly problematic when there is a substantial difference between the source and target poses, significantly impacting PGPIS applications in the fashion industry where clothing style preservation is crucial for copyright protection. Our analysis reveals that this limitation primarily stems from the conditional diffusion model's attention modules failing to adequately capture and preserve clothing patterns. To address this limitation, we propose human-parsing-guided attention…
Taxonomy
Topics: Robot Manipulation and Learning · Hand Gesture Recognition Systems · Human Pose and Action Recognition
Methods: Softmax · Attention Is All You Need · Diffusion · Siamese Network
