Story2Board: A Training-Free Approach for Expressive Storyboard Generation
David Dinkevich, Matan Levy, Omri Avrahami, Dvir Samuel, and Dani Lischinski

TL;DR
Story2Board is a training-free framework that enhances visual storytelling by generating coherent, diverse, and engaging storyboards from natural language, using novel consistency mechanisms without model fine-tuning.
Contribution
The paper introduces a lightweight, training-free consistency framework built from two components, Latent Panel Anchoring and Reciprocal Attention Value Mixing, to improve storyboard coherence and diversity, along with a new benchmark and metrics for evaluation.
Findings
Produces more coherent and diverse storyboards than baselines.
Enhances visual storytelling without model fine-tuning.
Outperforms existing methods in user studies and quantitative metrics.
Abstract
We present Story2Board, a training-free framework for expressive storyboard generation from natural language. Existing methods narrowly focus on subject identity, overlooking key aspects of visual storytelling such as spatial composition, background evolution, and narrative pacing. To address this, we introduce a lightweight consistency framework composed of two components: Latent Panel Anchoring, which preserves a shared character reference across panels, and Reciprocal Attention Value Mixing, which softly blends visual features between token pairs with strong reciprocal attention. Together, these mechanisms enhance coherence without architectural changes or fine-tuning, enabling state-of-the-art diffusion models to generate visually diverse yet consistent storyboards. To structure generation, we use an off-the-shelf language model to convert free-form stories into grounded panel-level…
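The abstract describes Reciprocal Attention Value Mixing only at a high level, so the following is a minimal NumPy sketch of the general idea and not the authors' implementation. Everything specific in it is an assumption made for illustration: the function name `reciprocal_value_mixing`, the use of `min(A[i, j], A[j, i])` as the reciprocal-attention score, the single-best-partner rule, and the `lam` and `threshold` parameters.

```python
import numpy as np

def reciprocal_value_mixing(attn, values, lam=0.5, threshold=0.1):
    """Blend value vectors of token pairs that attend strongly to each other.

    attn:      (n, n) attention weights, attn[i, j] = how much token i attends to j.
    values:    (n, d) value vectors, one per token.
    lam:       blend weight given to the partner's value (hypothetical parameter).
    threshold: minimum reciprocal score required to mix (hypothetical parameter).
    """
    attn = np.asarray(attn, dtype=float)
    values = np.asarray(values, dtype=float)

    # Assumed reciprocal score: high only when i attends to j AND j attends to i.
    recip = np.minimum(attn, attn.T)
    np.fill_diagonal(recip, -np.inf)  # a token is never its own partner

    # For each token, pick the partner with the strongest reciprocal attention.
    partner = recip.argmax(axis=1)
    strength = recip[np.arange(recip.shape[0]), partner]

    # Softly blend values only where the reciprocal attention is strong enough.
    mask = strength > threshold
    mixed = values.copy()
    mixed[mask] = (1.0 - lam) * values[mask] + lam * values[partner[mask]]
    return mixed, partner, mask
```

In this sketch, tokens without a strongly reciprocal partner keep their original values, so the blending only touches pairs that attend to each other. All blends read from the original `values`, so the result does not depend on token order.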
Taxonomy
Topics: Multimodal Machine Learning Applications · Artificial Intelligence in Games · Generative Adversarial Networks and Image Synthesis
