BooW-VTON: Boosting In-the-Wild Virtual Try-On via Mask-Free Pseudo Data Training

Xuanpu Zhang; Dan Song; Pengxin Zhan; Tianyu Chang; Jianhao Zeng; Qingguo Chen; Weihua Luo; Anan Liu

arXiv:2408.06047·cs.CV·November 25, 2024

TL;DR

BooW-VTON introduces a mask-free diffusion model for virtual try-on that leverages pseudo-data and in-the-wild augmentation to produce high-quality, artifact-free try-on images without requiring complex masking or parsing.

Contribution

The paper presents a novel mask-free training paradigm for virtual try-on, utilizing pseudo-data and data augmentation to improve realism and robustness in wild scenarios.

Findings

01. Outperforms existing methods in in-the-wild scenarios

02. Produces high-quality, artifact-free try-on images

03. Operates without the cost of garment parsing

Abstract

Image-based virtual try-on is an increasingly popular and important task that generates realistic try-on images of a specific person. Recent methods model virtual try-on as an image mask-inpainting task, which requires masking the person image and results in a significant loss of spatial information. In in-the-wild try-on scenarios with complex poses and occlusions especially, mask-based methods often introduce noticeable artifacts. Our research found that a mask-free approach can fully leverage spatial and lighting information from the original person image, enabling high-quality virtual try-on. Consequently, we propose a novel training paradigm for a mask-free try-on diffusion model. We ensure the model's mask-free try-on capability by creating high-quality pseudo-data, and we further enhance its handling of complex spatial information through effective in-the-wild data augmentation. Besides, a…
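The abstract's point about information loss can be made concrete with a toy example. The following is a minimal sketch, not the authors' code: it assumes an 8x8 grayscale "person image" and a rectangular garment mask, and the names `apply_mask`, `retained_fraction`, and `garment_box` are hypothetical. It only compares how many original pixels a mask-based input and a mask-free input expose to a model.

```python
def apply_mask(image, box):
    """Return a copy of `image` with pixels inside `box` set to None.

    `box` is (top, left, bottom, right), with bottom/right exclusive.
    None stands in for a masked-out pixel the model cannot see.
    """
    top, left, bottom, right = box
    return [
        [None if top <= r < bottom and left <= c < right else px
         for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]


def retained_fraction(image):
    """Fraction of pixels that are still visible (not None)."""
    total = sum(len(row) for row in image)
    kept = sum(px is not None for row in image for px in row)
    return kept / total


# Toy 8x8 grayscale person image (assumption: values are arbitrary).
person = [[r * 8 + c for c in range(8)] for r in range(8)]

# Hypothetical garment region: rows 2..6, columns 1..6 (30 pixels).
garment_box = (2, 1, 7, 7)

mask_based_input = apply_mask(person, garment_box)  # inpainting-style input
mask_free_input = person                            # full image, nothing hidden

print(retained_fraction(mask_based_input))  # 0.53125
print(retained_fraction(mask_free_input))   # 1.0
```

In this toy setting the masked input keeps 34 of 64 pixels, so pose, occluder, and lighting cues inside the garment region are unavailable to the model, while the mask-free input keeps all of them. Real masks follow the garment or body shape rather than a rectangle, so the actual fraction varies per image.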

Code & Models

Repositories

little-misfit/boow-vton
none · Official

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Industrial Vision Systems and Defect Detection · Face recognition and analysis

Methods: Diffusion · Inpainting