MovieBench: A Hierarchical Movie-Level Dataset for Long Video Generation
Weijia Wu, Mingyu Liu, Zeyu Zhu, Xi Xia, Haoen Feng, Wen Wang, Kevin Qinghong Lin, Chunhua Shen, Mike Zheng Shou

TL;DR
MovieBench is a new hierarchical dataset of long, coherent movies designed to facilitate research and development in long video generation models, addressing current limitations in scene complexity and character consistency.
Contribution
The paper introduces MovieBench, a comprehensive dataset with hierarchical structure, multi-scene narratives, and character consistency, specifically tailored for long video generation research.
Findings
MovieBench enables analysis of character consistency across scenes.
The dataset reveals challenges in maintaining narrative coherence.
Experiments show improved understanding of long video generation complexities.
Abstract
Recent advancements in video generation models, like Stable Video Diffusion, show promising results, but primarily focus on short, single-scene videos. These models struggle with generating long videos that involve multiple scenes, coherent narratives, and consistent characters. Furthermore, there is no publicly available dataset tailored for the analysis, evaluation, and training of long video generation models. In this paper, we present MovieBench: A Hierarchical Movie-Level Dataset for Long Video Generation, which addresses these challenges by providing unique contributions: (1) movie-length videos featuring rich, coherent storylines and multi-scene narratives, (2) consistency of character appearance and audio across scenes, and (3) a hierarchical data structure containing high-level movie information and detailed shot-level descriptions. Experiments demonstrate that MovieBench brings…
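To make the hierarchy described in the abstract concrete, here is a minimal sketch of how a movie-level record with nested shot-level descriptions might be represented. Every class, field, and method name below (`Movie`, `Shot`, `character_bank`, `shots_with`, `recurring_characters`) is a hypothetical illustration and not the dataset's actual schema; the sample values are invented placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Shot:
    """Shot-level annotation (hypothetical fields, not the real schema)."""
    shot_id: int
    description: str
    characters: List[str] = field(default_factory=list)


@dataclass
class Movie:
    """Movie-level record holding high-level info plus its shots."""
    title: str
    synopsis: str
    # Assumed: one reference note per character, reused across shots.
    character_bank: Dict[str, str] = field(default_factory=dict)
    shots: List[Shot] = field(default_factory=list)

    def shots_with(self, name: str) -> List[Shot]:
        """Return every shot in which the named character appears."""
        return [s for s in self.shots if name in s.characters]

    def recurring_characters(self) -> List[str]:
        """Names that appear in at least two shots, sorted alphabetically."""
        counts: Dict[str, int] = {}
        for shot in self.shots:
            for name in set(shot.characters):
                counts[name] = counts.get(name, 0) + 1
        return sorted(n for n, c in counts.items() if c >= 2)


# Placeholder data for illustration only.
movie = Movie(
    title="Example Movie",
    synopsis="Two friends search for a lost map.",
    character_bank={"Alice": "red coat", "Bob": "grey hat"},
    shots=[
        Shot(0, "Alice studies a torn map.", ["Alice"]),
        Shot(1, "Bob waits at the harbor.", ["Bob"]),
        Shot(2, "Alice and Bob meet on the pier.", ["Alice", "Bob"]),
    ],
)

alice_shots = [s.shot_id for s in movie.shots_with("Alice")]
recurring = movie.recurring_characters()
```

Grouping shots under a single movie record with a shared character bank is one way to check that a character recurs with the same reference across shots, which is the kind of cross-scene consistency the abstract highlights.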
Taxonomy
Topics: Cinema and Media Studies · Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis
Methods: Diffusion · Focus
