Story2Board: A Training-Free Approach for Expressive Storyboard Generation

David Dinkevich, Matan Levy, Omri Avrahami, Dvir Samuel, and Dani Lischinski

arXiv:2508.09983 · cs.CV · August 14, 2025

Open Access

TL;DR

Story2Board is a training-free framework that generates coherent, visually diverse storyboards from natural-language stories. It adds two consistency mechanisms, Latent Panel Anchoring and Reciprocal Attention Value Mixing, to existing diffusion models without fine-tuning or architectural changes.

Contribution

The paper introduces a lightweight, training-free consistency framework built on two mechanisms, Latent Panel Anchoring and Reciprocal Attention Value Mixing, which improves storyboard coherence while preserving visual diversity. It also contributes a new benchmark and metrics for evaluation.

Findings

01. Produces more coherent and diverse storyboards than baselines.

02. Enhances visual storytelling without model fine-tuning.

03. Outperforms existing methods in user studies and quantitative metrics.

Abstract

We present Story2Board, a training-free framework for expressive storyboard generation from natural language. Existing methods narrowly focus on subject identity, overlooking key aspects of visual storytelling such as spatial composition, background evolution, and narrative pacing. To address this, we introduce a lightweight consistency framework composed of two components: Latent Panel Anchoring, which preserves a shared character reference across panels, and Reciprocal Attention Value Mixing, which softly blends visual features between token pairs with strong reciprocal attention. Together, these mechanisms enhance coherence without architectural changes or fine-tuning, enabling state-of-the-art diffusion models to generate visually diverse yet consistent storyboards. To structure generation, we use an off-the-shelf language model to convert free-form stories into grounded panel-level…
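The abstract describes Reciprocal Attention Value Mixing only in words, so the following is a minimal sketch of one way the idea could look, not the authors' implementation. It assumes a precomputed attention matrix, scores a token pair by the weaker of its two attention directions, and blends each token's value vector with its strongest reciprocal partner. The function name and the `blend` and `threshold` parameters are hypothetical and not taken from the paper.

```python
def reciprocal_value_mixing(attn, values, blend=0.5, threshold=0.1):
    """Blend each token's value vector with its strongest reciprocal partner.

    attn:      n x n attention weights, attn[i][j] = how much token i attends to j
    values:    n value vectors (lists of floats) of equal length
    blend:     hypothetical mixing weight given to the partner's values
    threshold: hypothetical minimum reciprocal score required to mix at all
    """
    n = len(attn)
    mixed = [list(v) for v in values]  # start from unmodified copies
    for i in range(n):
        best_j, best_score = None, threshold
        for j in range(n):
            if j == i:
                continue
            # Assumption: a pair is "reciprocal" only if BOTH directions are
            # strong, so score it by the weaker of the two attention weights.
            score = min(attn[i][j], attn[j][i])
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            # Soft blend computed from the original values, so the result
            # does not depend on the order in which tokens are visited.
            mixed[i] = [
                (1 - blend) * a + blend * b
                for a, b in zip(values[i], values[best_j])
            ]
    return mixed


# Tokens 0 and 1 attend strongly to each other; token 2 has no strong partner.
attn = [[0.1, 0.8, 0.1],
        [0.7, 0.2, 0.1],
        [0.3, 0.3, 0.4]]
values = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
mixed = reciprocal_value_mixing(attn, values, blend=0.5, threshold=0.5)
```

In this toy input, tokens 0 and 1 are pulled toward each other while token 2 is left untouched, which illustrates how a soft blend could tie matching regions together without forcing every token to change.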

Taxonomy

Topics: Multimodal Machine Learning Applications · Artificial Intelligence in Games · Generative Adversarial Networks and Image Synthesis