DreamShot: Personalized Storyboard Synthesis with Video Diffusion Prior

Junjia Huang; Binbin Yang; Pengxiang Yan; Jiyang Liu; Bin Xia; Zhao Wang; Yitong Wang; Liang Lin; Guanbin Li

arXiv:2604.17195 · cs.CV · April 21, 2026

TL;DR

DreamShot is a novel video diffusion-based storyboard synthesis framework that generates coherent, narrative-driven shot sequences with consistent characters and scenes, supporting flexible text and reference inputs.

Contribution

DreamShot introduces a controllable multi-shot storyboard generation method that leverages video diffusion priors and a role-conditioning module to keep character identities consistent.

Findings

01. Outperforms state-of-the-art models in scene coherence and role consistency.
02. Supports both text-to-shot and reference-to-shot generation.
03. Produces visually and semantically coherent story sequences.

Abstract

Storyboard synthesis plays a crucial role in visual storytelling, aiming to generate coherent shot sequences that visually narrate cinematic events with consistent characters, scenes, and transitions. However, existing approaches are mostly adapted from text-to-image diffusion models, which struggle to maintain long-range temporal coherence, consistent character identities, and narrative flow across multiple shots. In this paper, we introduce DreamShot, a storyboard framework built on a video generative model that fully exploits powerful video diffusion priors for controllable multi-shot synthesis. DreamShot supports both Text-to-Shot and Reference-to-Shot generation, as well as story continuation conditioned on previous frames, enabling flexible and context-aware storyboard generation. By leveraging the spatial-temporal consistency inherent in video generative models, DreamShot produces…
