RealVVT: Towards Photorealistic Video Virtual Try-on via Spatio-Temporal Consistency
Siqi Li, Zhengkai Jiang, Jiawei Zhou, Zhihong Liu, Xiaowei Chi, Haoqian Wang

TL;DR
RealVVT introduces a novel framework for photorealistic video virtual try-on that ensures spatial and temporal consistency, significantly improving realism and stability in extended video sequences for fashion applications.
Contribution
The paper presents a new video virtual try-on method leveraging foundation models, with innovative strategies for consistency and pose-guided long video handling, advancing beyond prior single-image approaches.
Findings
Outperforms state-of-the-art methods on both video and image VTO tasks
Enhances stability and realism in extended video sequences
Effective in practical fashion e-commerce applications
Abstract
Virtual try-on has emerged as a pivotal task at the intersection of computer vision and fashion, aimed at digitally simulating how clothing items fit on the human body. Despite notable progress in single-image virtual try-on (VTO), current methodologies often struggle to preserve a consistent and authentic appearance of clothing across extended video sequences. This challenge arises from the complexities of capturing dynamic human pose and maintaining target clothing characteristics. We leverage pre-existing video foundation models to introduce RealVVT, a photoRealistic Video Virtual Try-on framework tailored to bolster stability and realism within dynamic video contexts. Our methodology encompasses a Clothing & Temporal Consistency strategy, an Agnostic-guided Attention Focus Loss mechanism to ensure spatial consistency, and a Pose-guided Long Video VTO technique adept at handling…
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Advanced Image Processing Techniques
Methods: Softmax · Attention Is All You Need · Focus
