What Matters in Virtual Try-Off? Dual-UNet Diffusion Model For Garment Reconstruction

Loc-Phat Truong; Meysam Madadi; Sergio Escalera

arXiv:2604.08716·cs.CV·April 13, 2026

TL;DR

This paper studies the Dual-UNet Diffusion Model for garment reconstruction in Virtual Try-Off (VTOFF), analyzing design strategies and training methods to reach state-of-the-art results.

Contribution

It adapts diffusion-based strategies from Virtual Try-On (VTON) and general Latent Diffusion Models (LDMs) to VTOFF, establishing a robust architecture and providing a comprehensive analysis with strong baseline results.

Findings

01

Achieves a 9.5% improvement on the DISTS metric.

02

Analyzes effects of different diffusion backbones and conditioning methods.

03

Provides insights into training strategies for garment reconstruction.

Abstract

Virtual Try-On (VTON) has seen rapid advancements, providing a strong foundation for generative fashion tasks. However, the inverse problem, Virtual Try-Off (VTOFF), aimed at reconstructing the canonical garment from a draped-on image, remains a less understood domain, distinct from the heavily researched field of VTON. In this work, we seek to establish a robust architectural foundation for VTOFF by studying and adapting various diffusion-based strategies from VTON and general Latent Diffusion Models (LDMs). We focus our investigation on the Dual-UNet Diffusion Model architecture and analyze three axes of design: (i) Generation Backbone: comparing Stable Diffusion variants; (ii) Conditioning: ablating different mask designs, masked/unmasked inputs for image conditioning, and the utility of high-level semantic features; and (iii) Losses and Training Strategies: evaluating the impact of…
