Street TryOn: Learning In-the-Wild Virtual Try-On from Unpaired Person Images
Aiyu Cui, Jay Mahajan, Viraj Shah, Preeti Gomathinayagam, Chang Liu, Svetlana Lazebnik

TL;DR
This paper introduces a new benchmark and a novel method for in-the-wild virtual try-on that does not require paired training data, enabling realistic garment visualization on casual photos with diverse backgrounds.
Contribution
The work presents a new in-the-wild try-on benchmark and a method that learns from unpaired images, addressing pose variation and background complexity.
Findings
Achieves state-of-the-art results in street try-on tasks.
Performs competitively on standard studio try-on benchmarks.
Effectively handles diverse poses and backgrounds in in-the-wild scenarios.
Abstract
Most virtual try-on research is motivated to serve the fashion business by generating images to demonstrate garments on studio models at a lower cost. However, virtual try-on should be a broader application that also allows customers to visualize garments on themselves using their own casual photos, known as in-the-wild try-on. Unfortunately, the existing methods, which achieve plausible results for studio try-on settings, perform poorly in the in-the-wild context. This is because these methods often require paired images (garment images paired with images of people wearing the same garment) for training. While such paired data is easy to collect from shopping websites for studio settings, it is difficult to obtain for in-the-wild scenes. In this work, we fill the gap by (1) introducing a StreetTryOn benchmark to support in-the-wild virtual try-on applications and (2) proposing a…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Video Surveillance and Tracking Methods
Methods: Sparse Evolutionary Training · Latent Diffusion Model · Focus · Inpainting
