FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery

Tasin Islam; Alina Miron; XiaoHui Liu; Yongmin Li

arXiv:2310.00106 · cs.CV · January 23, 2024 · 2 cites

TL;DR

FashionFlow is a novel diffusion-based method that converts static fashion images into high-quality, dynamic videos, enhancing online shopping experiences by showcasing garments from multiple angles.

Contribution

The paper introduces FashionFlow, a diffusion model-based framework that synthesizes realistic fashion videos from still images using pseudo-3D convolutions and encoder features.

Findings

01 High-fidelity fashion video generation from images

02 Effective use of VAE and CLIP encoders for conditioning

03 Potential to improve online fashion retail experiences

Abstract

Our study introduces a new image-to-video generator called FashionFlow to generate fashion videos. By utilising a diffusion model, we are able to create short videos from still fashion images. Our approach involves developing and connecting relevant components with the diffusion model, which results in the creation of high-fidelity videos that are aligned with the conditional image. The components include the use of pseudo-3D convolutional layers to generate videos efficiently. VAE and CLIP encoders capture vital characteristics from still images to condition the diffusion model at a global level. Our research demonstrates a successful synthesis of fashion videos featuring models posing from various angles, showcasing the fit and appearance of the garment. Our findings hold great promise for improving and enhancing the shopping experience for the online fashion industry.
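The efficiency claim for pseudo-3D convolutional layers can be illustrated with a small weight count. The sketch below assumes the common reading of "pseudo-3D": a full 3D convolution is replaced by a 2D spatial convolution followed by a 1D temporal convolution. The abstract does not give the layer configuration, so the channel counts and kernel size here are hypothetical, and the helper names are made up for illustration.

```python
def conv3d_params(c_in, c_out, k):
    """Weights in a full 3D convolution with a k x k x k kernel (bias excluded)."""
    return c_in * c_out * k * k * k


def pseudo3d_params(c_in, c_out, k):
    """Weights in a factorised layer (bias excluded).

    Assumed structure: a 2D spatial conv (k x k) mapping c_in -> c_out,
    followed by a 1D temporal conv (length k) mapping c_out -> c_out.
    """
    spatial = c_in * c_out * k * k
    temporal = c_out * c_out * k
    return spatial + temporal


# Hypothetical sizes, not taken from the paper.
c_in, c_out, k = 64, 64, 3

full = conv3d_params(c_in, c_out, k)
factored = pseudo3d_params(c_in, c_out, k)

print(f"full 3D conv weights:   {full}")
print(f"pseudo-3D conv weights: {factored}")
print(f"ratio (pseudo-3D / 3D): {factored / full:.3f}")
```

With these assumed sizes the factorised layer needs fewer than half the weights of the full 3D layer, and the gap widens as the kernel size grows, since the cost scales with k² + k rather than k³ when input and output channels are equal.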

Code & Models

Repositories

1702609/fashionflow (PyTorch, official)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Computer Graphics and Visualization Techniques

Methods: Contrastive Language-Image Pre-training · Diffusion