StoryBlender: Inter-Shot Consistent and Editable 3D Storyboard with Spatial-temporal Dynamics

Bingliang Li; Zhenhong Sun; Jiaming Bian; Yuehao Wu; Yifu Wang; Hongdong Li; Yatao Bian; Huadong Mo; Daoyi Dong

arXiv:2604.03315 · cs.CV · April 7, 2026

TL;DR

StoryBlender is a novel 3D storyboard generation framework that ensures inter-shot consistency and explicit editability by combining semantic grounding, unified asset instantiation, and dynamic layout design.

Contribution

The paper introduces a three-stage pipeline with a hierarchical agent system that self-corrects spatial hallucinations, producing native 3D scenes with improved consistency and editability.

Findings

01. Significantly improves multi-shot consistency over baselines.
02. Supports direct, precise editing of cameras and assets.
03. Demonstrates effective long-horizon cinematic scene generation.

Abstract

Storyboarding is a core skill in visual storytelling for film, animation, and games. However, automating this process requires a system to achieve two properties that current approaches rarely satisfy simultaneously: inter-shot consistency and explicit editability. While 2D diffusion-based generators produce vivid imagery, they often suffer from identity drift and offer limited geometric control; conversely, traditional 3D animation workflows are consistent and editable but require expert-heavy, labor-intensive authoring. We present StoryBlender, a grounded 3D storyboard generation framework governed by a Story-centric Reflection Scheme. At its core, the StoryBlender system is built on a three-stage pipeline: (1) Semantic-Spatial Grounding, which constructs a continuity memory graph that decouples global assets from shot-specific variables for long-horizon consistency; (2)…
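
To make the first stage more concrete, the following is a minimal Python sketch of one way a continuity memory could separate global assets from shot-specific variables. It is an illustration only, not the authors' implementation: the class names (`GlobalAsset`, `ShotState`, `ContinuityMemory`), their fields, and the example shots are all hypothetical, and the abstract does not describe the actual graph structure.

```python
from __future__ import annotations

from dataclasses import dataclass, field

Vec3 = tuple[float, float, float]


@dataclass(frozen=True)
class GlobalAsset:
    """Hypothetical shot-independent record for a character, prop, or set."""
    asset_id: str
    description: str


@dataclass
class ShotState:
    """Hypothetical shot-specific variables: asset placements and the camera."""
    shot_id: str
    placements: dict[str, Vec3]  # asset_id -> position in this shot
    camera_position: Vec3


@dataclass
class ContinuityMemory:
    """Hypothetical store linking shared assets to per-shot state."""
    assets: dict[str, GlobalAsset] = field(default_factory=dict)
    shots: dict[str, ShotState] = field(default_factory=dict)

    def add_asset(self, asset: GlobalAsset) -> None:
        self.assets[asset.asset_id] = asset

    def add_shot(self, shot: ShotState) -> None:
        # Reject shots that reference assets the memory does not know about,
        # so every shot is grounded in the shared asset set.
        missing = [a for a in shot.placements if a not in self.assets]
        if missing:
            raise KeyError(f"shot {shot.shot_id} references unknown assets: {missing}")
        self.shots[shot.shot_id] = shot

    def resolve(self, shot_id: str) -> list[tuple[GlobalAsset, Vec3]]:
        # Pair each shared asset with its position in the requested shot.
        shot = self.shots[shot_id]
        return [(self.assets[a], pos) for a, pos in shot.placements.items()]


# Example: one character reused across two shots with different placements.
memory = ContinuityMemory()
memory.add_asset(GlobalAsset("hero", "a detective in a grey coat"))
memory.add_shot(ShotState("s1", {"hero": (0.0, 0.0, 0.0)}, (0.0, 1.6, 4.0)))
memory.add_shot(ShotState("s2", {"hero": (2.0, 0.0, -1.0)}, (3.0, 1.6, 2.0)))
```

Because each shot stores only an asset identifier plus its own placement and camera, both shots in this sketch resolve to the same `GlobalAsset` object. That is the sense in which separating global assets from per-shot variables can help keep identity stable while leaving each shot independently editable.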

Code & Models

Repositories

https://engineeringai-lab.github.io/StoryBlender
github

Datasets

EngineeringAI-LAB/CineBoard3D
dataset · 146 downloads