Cut2Next: Generating Next Shot via In-Context Tuning

Jingwen He; Hongbo Liu; Jiajun Li; Ziqi Huang; Yu Qiao; Wanli Ouyang; Ziwei Liu

arXiv:2508.08244 · cs.CV · August 13, 2025

TL;DR

Cut2Next introduces a diffusion transformer-based framework that generates next shots in film sequences by adhering to professional editing patterns and cinematic continuity, enhancing narrative flow and visual coherence.

Contribution

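The paper presents an in-context tuning approach for cinematic shot generation that combines hierarchical prompts with architectural innovations. It targets a limitation of existing methods, which often prioritize basic visual consistency over the editing patterns that drive narrative flow.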

Findings

01. Outperforms baselines in visual consistency and text fidelity.
02. User studies favor Cut2Next for editing pattern adherence.
03. Effective in maintaining cinematic continuity.

Abstract

Effective multi-shot generation demands purposeful, film-like transitions and strict cinematic continuity. Current methods, however, often prioritize basic visual consistency, neglecting crucial editing patterns (e.g., shot/reverse shot, cutaways) that drive narrative flow for compelling storytelling. This yields outputs that may be visually coherent but lack narrative sophistication and true cinematic integrity. To bridge this, we introduce Next Shot Generation (NSG): synthesizing a subsequent, high-quality shot that critically conforms to professional editing patterns while upholding rigorous cinematic continuity. Our framework, Cut2Next, leverages a Diffusion Transformer (DiT). It employs in-context tuning guided by a novel Hierarchical Multi-Prompting strategy. This strategy uses Relational Prompts to define overall context and inter-shot editing styles. Individual Prompts then…
