Video Diffusion Models: A Survey

Andrew Melnik; Michal Ljubljanac; Cong Lu; Qi Yan; Weiming Ren; Helge Ritter

arXiv:2405.03150·cs.CV·November 19, 2024·1 cites

Open Access · 1 Repo

TL;DR

This survey comprehensively reviews diffusion models for video generation, covering principles, architectures, applications, and challenges, highlighting recent advancements and future research directions in the field.

Contribution

The survey provides a detailed taxonomy of video diffusion models, summarizes recent progress in text-to-video synthesis, and discusses current challenges and future prospects.

Findings

01. Advancements in text-to-video generation capabilities.

02. Identification of key challenges like longer video synthesis and computational costs.

03. Summary of evaluation metrics and datasets used in the field.

Abstract

Diffusion generative models have recently become a powerful technique for creating and modifying high-quality, coherent video content. This survey provides a comprehensive overview of the critical components of diffusion models for video generation, including their applications, architectural design, and temporal dynamics modeling. The paper begins by discussing the core principles and mathematical formulations, then explores various architectural choices and methods for maintaining temporal consistency. A taxonomy of applications is presented, categorizing models based on input modalities such as text prompts, images, videos, and audio signals. Advancements in text-to-video generation are discussed to illustrate the state-of-the-art capabilities and limitations of current approaches. Additionally, the survey summarizes recent developments in training and evaluation practices, including…

Code & Models

Repositories

ndrwmlnk/awesome-video-diffusion-models (Official)

Taxonomy

Topics: Video Coding and Compression Technologies · Advanced Data Compression Techniques

Methods: Diffusion