BridgeDiff: Bridging Human Observations and Flat-Garment Synthesis for Virtual Try-Off
Shuang Liu, Ao Yu, Linkang Cheng, Xiwen Huang, Li Zhao, Junhui Liu, Zhiting Lin, Yu Liu

TL;DR
BridgeDiff is a diffusion-based framework that improves virtual try-off by explicitly connecting human observations with flat-garment synthesis, resulting in more accurate and stable garment reconstructions.
Contribution
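BridgeDiff introduces two complementary modules that bridge human-centric observations and flat-garment synthesis: the Garment Condition Bridge Module (GCBM), which builds a garment-cue representation to support detail inference under partial visibility, and the Flat Structure Constraint Module (FSCM), which is aimed at structural stability.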
Findings
Achieves state-of-the-art results on standard benchmarks.
Produces higher-quality, structurally stable flat-garment reconstructions.
Effectively preserves fine-grained appearance details.
Abstract
Virtual try-off (VTOFF) aims to recover canonical flat-garment representations from images of dressed persons for standardized display and downstream virtual try-on. Prior methods often treat VTOFF as direct image translation driven by local masks or text-only prompts, overlooking the gap between on-body appearances and flat layouts. This gap frequently leads to inconsistent completion in unobserved regions and unstable garment structure. We propose BridgeDiff, a diffusion-based framework that explicitly bridges human-centric observations and flat-garment synthesis through two complementary components. First, the Garment Condition Bridge Module (GCBM) builds a garment-cue representation that captures global appearance and semantic identity, enabling robust inference of continuous details under partial visibility. Second, the Flat Structure Constraint Module (FSCM) injects explicit…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Face recognition and analysis
