BridgeDiff: Bridging Human Observations and Flat-Garment Synthesis for Virtual Try-Off

Shuang Liu; Ao Yu; Linkang Cheng; Xiwen Huang; Li Zhao; Junhui Liu; Zhiting Lin; Yu Liu

arXiv:2603.09236 · cs.CV · March 11, 2026

Open Access

TL;DR

BridgeDiff is a diffusion-based framework that improves virtual try-off by explicitly connecting human observations with flat-garment synthesis, resulting in more accurate and stable garment reconstructions.

Contribution

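BridgeDiff introduces two complementary modules that bridge human-centric observations and flat-garment synthesis: the Garment Condition Bridge Module (GCBM), which builds a garment-cue representation to support detail inference under partial visibility, and the Flat Structure Constraint Module (FSCM), which targets the structural stability of the flat layout.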

Findings

01. Achieves state-of-the-art results on standard benchmarks.
02. Produces higher-quality, structurally stable flat-garment reconstructions.
03. Effectively preserves fine-grained appearance details.

Abstract

Virtual try-off (VTOFF) aims to recover canonical flat-garment representations from images of dressed persons for standardized display and downstream virtual try-on. Prior methods often treat VTOFF as direct image translation driven by local masks or text-only prompts, overlooking the gap between on-body appearances and flat layouts. This gap frequently leads to inconsistent completion in unobserved regions and unstable garment structure. We propose BridgeDiff, a diffusion-based framework that explicitly bridges human-centric observations and flat-garment synthesis through two complementary components. First, the Garment Condition Bridge Module (GCBM) builds a garment-cue representation that captures global appearance and semantic identity, enabling robust inference of continuous details under partial visibility. Second, the Flat Structure Constraint Module (FSCM) injects explicit…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Face Recognition and Analysis