What Matters in Virtual Try-Off? Dual-UNet Diffusion Model For Garment Reconstruction
Loc-Phat Truong, Meysam Madadi, Sergio Escalera

TL;DR
This paper introduces a Dual-UNet Diffusion Model for garment reconstruction in Virtual Try-Off (VTOFF) and reports state-of-the-art results, supported by an analysis of design strategies and training methods.
Contribution
It adapts diffusion-based strategies for VTOFF, establishing a robust architecture and providing comprehensive analysis and strong baseline results.
Findings
Achieves a 9.5% improvement on the DISTS metric.
Analyzes the effects of different diffusion backbones and conditioning methods.
Provides insights into training strategies for garment reconstruction.
Abstract
Virtual Try-On (VTON) has seen rapid advancements, providing a strong foundation for generative fashion tasks. However, the inverse problem, Virtual Try-Off (VTOFF), which aims to reconstruct the canonical garment from a draped-on image, remains a less understood domain, distinct from the heavily researched field of VTON. In this work, we seek to establish a robust architectural foundation for VTOFF by studying and adapting various diffusion-based strategies from VTON and general Latent Diffusion Models (LDMs). We focus our investigation on the Dual-UNet Diffusion Model architecture and analyze three axes of design: (i) Generation Backbone: comparing Stable Diffusion variants; (ii) Conditioning: ablating different mask designs, masked/unmasked inputs for image conditioning, and the utility of high-level semantic features; and (iii) Losses and Training Strategies: evaluating the impact of…
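The masked versus unmasked image conditioning named in axis (ii) can be illustrated with a small sketch. This is not the paper's implementation: the function name make_condition, the array shapes, the fill value, and the mask convention (1 marks garment pixels, which are kept) are all assumptions chosen for illustration.

```python
import numpy as np

def make_condition(person, garment_mask, masked=True, fill=0.0):
    """Build a hypothetical image-conditioning input.

    person:       (H, W, 3) float array, the draped-on (person) image.
    garment_mask: (H, W) array, assumed to be 1 on garment pixels, 0 elsewhere.
    masked=True   keeps only garment pixels and fills the rest with `fill`.
    masked=False  returns the full, unmasked image.
    """
    if not masked:
        return person.copy()
    # Broadcast the mask over the channel axis.
    m = garment_mask.astype(person.dtype)[..., None]
    return person * m + fill * (1.0 - m)

# Tiny usage example on a 2x2 image of ones.
person = np.ones((2, 2, 3), dtype=np.float32)
mask = np.array([[1, 0], [0, 1]])
masked_input = make_condition(person, mask, masked=True)
unmasked_input = make_condition(person, mask, masked=False)
```

Under these assumptions, an ablation of the kind the abstract describes would feed either masked_input or unmasked_input to the conditioning branch and compare the reconstructions.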
