Fashion-VDM: Video Diffusion Model for Virtual Try-On
Johanna Karras, Yingwei Li, Nan Liu, Luyang Zhu, Innfarn Yoo, Andreas Lugmayr, Chris Lee, Ira Kemelmacher-Shlizerman

TL;DR
Fashion-VDM introduces a novel diffusion-based video virtual try-on model that produces high-quality, temporally consistent videos of garments on persons, outperforming existing methods in detail preservation and control.
Contribution
The paper presents a diffusion-based architecture with split classifier-free guidance and progressive temporal training for improved video virtual try-on quality and control.
Findings
Sets a new state of the art in video virtual try-on quality.
Effective joint image-video training enhances performance with limited video data.
Achieves high-resolution, temporally consistent try-on videos.
Abstract
We present Fashion-VDM, a video diffusion model (VDM) for generating virtual try-on videos. Given an input garment image and person video, our method aims to generate a high-quality try-on video of the person wearing the given garment, while preserving the person's identity and motion. Image-based virtual try-on has shown impressive results; however, existing video virtual try-on (VVT) methods are still lacking garment details and temporal consistency. To address these issues, we propose a diffusion-based architecture for video virtual try-on, split classifier-free guidance for increased control over the conditioning inputs, and a progressive temporal training strategy for single-pass 64-frame, 512px video generation. We also demonstrate the effectiveness of joint image-video training for video try-on, especially when video data is limited. Our qualitative and quantitative experiments…
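To make the idea of split classifier-free guidance more concrete, here is a minimal sketch of one common way to give each conditioning input its own guidance weight: conditioning groups are added one at a time, and each increment in the noise prediction is scaled separately. This summary does not give the paper's exact formulation, so the function `split_cfg`, the group names `person` and `garment`, and the toy denoiser are all illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence

Noise = List[float]
# Hypothetical denoiser interface: maps the active conditioning groups to a noise prediction.
Denoiser = Callable[[Dict[str, float]], Noise]


def split_cfg(
    denoise: Denoiser,
    conds: Dict[str, float],
    order: Sequence[str],
    weights: Sequence[float],
) -> Noise:
    """Combine noise predictions using one guidance weight per conditioning group.

    Assumed formulation: start from the unconditional prediction, then add the
    groups in `order`, scaling each incremental change by its own weight.
    """
    if len(order) != len(weights):
        raise ValueError("need exactly one weight per conditioning group")

    active: Dict[str, float] = {}
    prev = denoise(active)  # unconditional prediction (no groups active)
    guided = list(prev)
    for name, weight in zip(order, weights):
        active = {**active, name: conds[name]}  # switch on one more group
        cur = denoise(active)
        # Scale only the change contributed by the newly added group.
        guided = [g + weight * (c - p) for g, c, p in zip(guided, cur, prev)]
        prev = cur
    return guided


def toy_denoiser(active: Dict[str, float]) -> Noise:
    """Stand-in for a real model: the 'prediction' is the sum of active signals."""
    return [float(sum(active.values()))]


conds = {"person": 1.0, "garment": 2.0}
order = ["person", "garment"]
plain = split_cfg(toy_denoiser, conds, order, [1.0, 1.0])
garment_heavy = split_cfg(toy_denoiser, conds, order, [1.0, 2.0])
```

With all weights set to 1 the result equals the fully conditioned prediction, and raising a single weight strengthens only that input's influence, which is the kind of per-input control the abstract describes.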
Taxonomy
Topics: Multimedia Communication and Technology
Methods: Diffusion
