A Benchmark and Multi-Agent System for Instruction-driven Cinematic Video Compilation

Peixuan Zhang; Chang Zhou; Ziyuan Zhang; Hualuo Liu; Chunjie Zhang; Jingqi Liu; Xiaohui Zhou; Xi Chen; Shuchen Weng; Si Li; Boxin Shi

arXiv:2604.10456 · cs.CV · April 14, 2026

TL;DR

This paper introduces CineBench, a comprehensive benchmark for instruction-driven cinematic video compilation, and CineAgents, a multi-agent system that enhances narrative coherence in automated video editing.

Contribution

It presents the first benchmark for cinematic video compilation and a novel multi-agent system that improves narrative coherence through hierarchical memory and iterative planning.

Findings

01. CineAgents outperforms existing methods in coherence and logical consistency.

02. CineBench provides diverse instructions and high-quality annotations for evaluation.

03. The system demonstrates significant improvements in automated cinematic video compilation.

Abstract

The surging demand for adapting long-form cinematic content into short videos has created a need for versatile automatic video compilation systems. However, existing compilation methods are limited to predefined tasks, and the community lacks a comprehensive benchmark for evaluating cinematic compilation. To address this, we introduce CineBench, the first benchmark for instruction-driven cinematic video compilation, featuring diverse user instructions and high-quality ground-truth compilations annotated by professional editors. To overcome contextual collapse and temporal fragmentation, we present CineAgents, a multi-agent system that reformulates cinematic video compilation into a "design-and-compose" paradigm. CineAgents performs script reverse-engineering to construct a hierarchical narrative memory that provides multi-level context, and employs an iterative narrative planning…
