Video Diffusion Models: A Survey
Andrew Melnik, Michal Ljubljanac, Cong Lu, Qi Yan, Weiming Ren, Helge Ritter

TL;DR
This survey comprehensively reviews diffusion models for video generation, covering principles, architectures, applications, and challenges, highlighting recent advancements and future research directions in the field.
Contribution
It provides a detailed taxonomy of video diffusion models, summarizes recent progress in text-to-video synthesis, and discusses current challenges and future prospects.
Findings
Advancements in text-to-video generation capabilities.
Identification of key challenges like longer video synthesis and computational costs.
Summary of evaluation metrics and datasets used in the field.
Abstract
Diffusion generative models have recently become a powerful technique for creating and modifying high-quality, coherent video content. This survey provides a comprehensive overview of the critical components of diffusion models for video generation, including their applications, architectural design, and temporal dynamics modeling. The paper begins by discussing the core principles and mathematical formulations, then explores various architectural choices and methods for maintaining temporal consistency. A taxonomy of applications is presented, categorizing models based on input modalities such as text prompts, images, videos, and audio signals. Advancements in text-to-video generation are discussed to illustrate the state-of-the-art capabilities and limitations of current approaches. Additionally, the survey summarizes recent developments in training and evaluation practices, including…
Taxonomy
Topics: Video Coding and Compression Technologies · Advanced Data Compression Techniques
Methods: Diffusion
