ViViD: Video Virtual Try-on using Diffusion Models
Zixun Fang, Wei Zhai, Aimin Su, Hongliang Song, Kai Zhu, Mao Wang, Yu Chen, Zhiheng Liu, Yang Cao, Zheng-Jun Zha

TL;DR
ViViD introduces a diffusion model-based framework for video virtual try-on, achieving high-quality, temporally consistent videos by integrating garment and pose encoding with hierarchical temporal modules.
Contribution
The paper presents a novel diffusion model framework for video virtual try-on, including a garment encoder, pose encoder, and temporal modules, along with a new diverse high-resolution dataset.
Findings
Achieves high visual quality in video try-on results.
Ensures temporal and spatial consistency in generated videos.
Outperforms previous methods in visual fidelity and coherence.
Abstract
Video virtual try-on aims to transfer a clothing item onto the video of a target person. Directly applying image-based try-on techniques to video in a frame-wise manner produces temporally inconsistent results, while previous video-based try-on solutions generate only low-quality, blurry results. In this work, we present ViViD, a novel framework employing powerful diffusion models to tackle the task of video virtual try-on. Specifically, we design the Garment Encoder to extract fine-grained clothing semantic features, guiding the model to capture garment details and inject them into the target video through the proposed attention feature fusion mechanism. To ensure spatial-temporal consistency, we introduce a lightweight Pose Encoder to encode pose signals, enabling the model to learn the interactions between clothing and human posture and insert…
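The abstract names an attention feature fusion mechanism but does not spell it out here. As a rough illustration only, the sketch below assumes one common form of such fusion: garment tokens are appended to the video tokens as extra keys and values, so every video token can attend to garment details. The function name `fuse_with_garment`, the toy token values, and the absence of learned query/key/value projections are all assumptions made for this example, not details taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def fuse_with_garment(video_tokens, garment_tokens):
    """Hypothetical fusion step: each video token attends over the video
    tokens plus the garment tokens (scaled dot-product attention).

    Assumption: identity projections are used in place of learned
    query/key/value weights, to keep the sketch short.
    """
    context = video_tokens + garment_tokens  # concatenate along the token axis
    dim = len(video_tokens[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in video_tokens:
        weights = softmax([dot(query, key) * scale for key in context])
        # Weighted sum of context vectors, computed per feature dimension.
        mixed = [sum(w * value[i] for w, value in zip(weights, context))
                 for i in range(dim)]
        fused.append(mixed)
    return fused


# Toy inputs: two video tokens and one garment token, each of dimension 2.
video_tokens = [[1.0, 0.0], [0.0, 1.0]]
garment_tokens = [[2.0, 2.0]]
fused = fuse_with_garment(video_tokens, garment_tokens)
```

The output keeps one vector per video token, so the fused features can be passed on in place of the originals. Because the garment token sits among the keys and values, each output vector is pulled toward the garment features in proportion to its attention weight.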
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis
Methods: Diffusion
