FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery
Tasin Islam, Alina Miron, XiaoHui Liu, Yongmin Li

TL;DR
FashionFlow is a novel diffusion-based method that converts static fashion images into high-quality, dynamic videos, enhancing online shopping experiences by showcasing garments from multiple angles.
Contribution
The paper introduces FashionFlow, a diffusion-based framework that synthesizes realistic fashion videos from still images, using pseudo-3D convolutions for efficient video generation and VAE and CLIP encoder features to condition the model on the input image.
Findings
High-fidelity fashion video generation from images
Effective use of VAE and CLIP encoders for conditioning
Potential to improve online fashion retail experiences
Abstract
Our study introduces FashionFlow, a new image-to-video generator for fashion videos. By utilising a diffusion model, we are able to create short videos from still fashion images. Our approach involves developing components and connecting them to the diffusion model, which results in high-fidelity videos that are aligned with the conditional image. These components include pseudo-3D convolutional layers, which generate videos efficiently, and VAE and CLIP encoders, which capture vital characteristics from the still image to condition the diffusion model at a global level. Our research demonstrates successful synthesis of fashion videos featuring models posing from various angles, showcasing the fit and appearance of the garment. Our findings hold great promise for improving the shopping experience in the online fashion industry.
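The abstract names pseudo-3D convolutional layers as the source of efficiency but does not describe their layout. A minimal NumPy sketch follows, assuming the common factorisation of a 3D convolution into a per-frame 2D spatial convolution followed by a 1D convolution along the time axis. The function names, tensor layout (frames, channels, height, width) and kernel sizes are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def conv2d_same(x, w):
    """Zero-padded 2D cross-correlation on one frame.

    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd k. Returns (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    p = k // 2
    _, height, width = x.shape
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    out = np.zeros((c_out, height, width))
    for i in range(k):
        for j in range(k):
            # Mix channels for one kernel tap, then accumulate over taps.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + height, j:j + width])
    return out


def conv1d_time_same(x, w):
    """Zero-padded 1D cross-correlation along the frame axis.

    x: (T, C_in, H, W), w: (C_out, C_in, k_t) with odd k_t. Returns (T, C_out, H, W).
    """
    c_out, _, k_t = w.shape
    p = k_t // 2
    frames = x.shape[0]
    xp = np.pad(x, ((p, p), (0, 0), (0, 0), (0, 0)))
    out = np.zeros((frames, c_out) + x.shape[2:])
    for i in range(k_t):
        out += np.einsum("oc,tchw->tohw", w[:, :, i], xp[i:i + frames])
    return out


def pseudo3d_conv(video, w_spatial, w_temporal):
    """Assumed pseudo-3D layer: spatial conv per frame, then temporal conv."""
    spatial = np.stack([conv2d_same(frame, w_spatial) for frame in video])
    return conv1d_time_same(spatial, w_temporal)


def param_counts(c_in, c_out, k, k_t):
    """Weight counts (biases ignored) for a full 3D conv vs. the factorised form."""
    full_3d = c_in * c_out * k_t * k * k
    pseudo_3d = c_in * c_out * k * k + c_out * c_out * k_t
    return full_3d, pseudo_3d


# Illustrative usage on random data: 4 frames, 3 channels, 8x8 pixels.
rng = np.random.default_rng(0)
video = rng.normal(size=(4, 3, 8, 8))
w_spatial = rng.normal(size=(5, 3, 3, 3))
w_temporal = rng.normal(size=(5, 5, 3))
print(pseudo3d_conv(video, w_spatial, w_temporal).shape)
print(param_counts(64, 64, 3, 3))
```

Under this assumed factorisation, the weight count grows with the sum of the spatial and temporal kernel sizes rather than their product, which is one plausible reading of "efficiently" in the abstract. Setting the temporal kernel to an identity at its centre tap makes the layer reduce to an ordinary per-frame 2D convolution.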
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques
Methods: Contrastive Language-Image Pre-training · Diffusion
