Latent Video Diffusion Models for High-Fidelity Long Video Generation

Yingqing He; Tianyu Yang; Yong Zhang; Ying Shan; Qifeng Chen

arXiv:2211.13221 · cs.CV · March 21, 2023 · 26 cites

TL;DR

This paper introduces lightweight hierarchical latent diffusion models for high-fidelity, long video generation, significantly improving quality and length over previous methods while maintaining computational efficiency.

Contribution

The paper proposes a novel hierarchical latent diffusion framework with conditional latent perturbation and guidance, enabling realistic, long videos with reduced computational costs.

Findings

01 Outperforms previous pixel-space diffusion models in quality and length
02 Enables generation of videos with over a thousand frames
03 Demonstrates superior results in small domain and text-to-video tasks

Abstract

AI-generated content has attracted lots of attention recently, but photo-realistic video synthesis is still challenging. Although many attempts using GANs and autoregressive models have been made in this area, the visual quality and length of generated videos are far from satisfactory. Diffusion models have shown remarkable results recently but require significant computational resources. To address this, we introduce lightweight video diffusion models by leveraging a low-dimensional 3D latent space, significantly outperforming previous pixel-space video diffusion models under a limited computational budget. In addition, we propose hierarchical diffusion in the latent space such that longer videos with more than one thousand frames can be produced. To further overcome the performance degradation issue for long video generation, we propose conditional latent perturbation and…
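
To make the efficiency argument in the abstract concrete, the following is a rough sketch of how much smaller a 3D latent can be than the pixel-space clip it encodes. The clip size, the compression factors (`t_factor`, `s_factor`), and the latent channel count are illustrative assumptions, not values taken from the paper.

```python
from math import prod

def latent_shape(frames, height, width, t_factor=4, s_factor=8, latent_channels=4):
    """Shape (C, T, H, W) of a hypothetical 3D latent for a video clip.

    t_factor and s_factor are assumed temporal and spatial downsampling
    factors of the autoencoder; they are placeholders, not the paper's values.
    """
    return (latent_channels, frames // t_factor, height // s_factor, width // s_factor)

# Assumed RGB clip: channels, frames, height, width
pixel_shape = (3, 16, 256, 256)
z_shape = latent_shape(16, 256, 256)

# Ratio of element counts: how many fewer values the diffusion model denoises
ratio = prod(pixel_shape) / prod(z_shape)
print(z_shape, ratio)
```

Under these assumed factors the diffusion model operates on far fewer values per clip, which is the kind of saving a latent-space approach relies on; the actual ratio depends on the autoencoder used.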

Code & Models

Repositories

yingqinghe/lvdm (PyTorch, official)

Models

ReySajju742/VideoCrafter (model)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis

Methods: Diffusion