Video Is Worth a Thousand Images: Exploring the Latest Trends in Long Video Generation

Faraz Waseem; Muhammad Shahzad

arXiv:2412.18688 · cs.CV · August 6, 2025

Open Access

TL;DR

This survey reviews recent advances in long video generation, highlighting challenges, techniques like GANs and diffusion models, and future research directions to improve scalability and quality.

Contribution

The survey provides a comprehensive overview of current methods, datasets, metrics, and challenges in long video generation, guiding future research in the field.

Findings

01. Current state-of-the-art systems, such as OpenAI's Sora, are limited to videos of up to one minute.

02. Integrating generative AI with a divide-and-conquer approach could improve scalability for longer videos and offer greater control.

03. The survey identifies key challenges and future research directions in long video generation.

Abstract

An image may convey a thousand words, but a video composed of hundreds or thousands of image frames tells a more intricate story. Despite significant progress in multimodal large language models (MLLMs), generating extended videos remains a formidable challenge. As of this writing, OpenAI's Sora, the current state-of-the-art system, is still limited to producing videos of up to one minute in length. This limitation stems from the complexity of long video generation, which requires more than generative AI techniques for approximating density functions; essential aspects such as planning, story development, and maintaining spatial and temporal consistency present additional hurdles. Integrating generative AI with a divide-and-conquer approach could improve scalability for longer videos while offering greater control. In this survey, we examine the current landscape of long video…

Taxonomy

Topics: Digital Games and Media · Cinema and Media Studies

Methods: Diffusion