S2ED: From Story to Executable Descriptions for Consistency-Aware Story Illustration

Sijing Yin; Jiamou Liu; Xiao Tang; Yaser Shakib; Qian Liu

arXiv:2605.22448 · cs.AI · May 22, 2026

TL;DR

S2ED is a prompt-layer framework that converts a full story into a sequence of explicit, editable executable descriptions, improving multi-frame consistency and character fidelity in story illustration without retraining the generator.

Contribution

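The paper introduces a training-free, model-agnostic method that improves story illustration coherence through explicit, editable descriptions produced by three coordinated agents: one segments the narrative, one grounds canonical character attributes, and one enriches spatial and affective cues.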

Findings

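01  On Flintstones and Shakoo Maku, S2ED improves sequence-level consistency and character fidelity over strong prompting, large-model planning, and a reference training-based method.

02  State is carried in the prompts themselves, which keeps character identity, layout, and affect coherent across frames and keeps the propagation interpretable.

03  Drift can be repaired with local edits to individual descriptions, without retraining the generator.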

Abstract

Multi-frame story illustration requires long-horizon coherence beyond single-image text-to-image generation, including narrative decomposition and persistent character identity, layout, and affect across frames. We propose Story-to-Executable Descriptions (S2ED), a training-free, model-agnostic, prompt-layer framework that converts a full story into a sequence of explicit, editable executable descriptions for more consistent rendering. S2ED coordinates three agents to segment the narrative, ground canonical character attributes, and enrich spatial and affective cues, enabling interpretable prompt-carried state propagation and local edits to repair drift without retraining the generator. Experiments on Flintstones and Shakoo Maku show that S2ED improves sequence-level consistency and character fidelity over strong prompting, large-model planning, and a reference training-based method,…
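
To make the three-stage flow in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The schema `ExecutableDescription`, the function names, and the rule-based stand-ins for each agent are all hypothetical; in the paper the agents are presumably model-driven, and the character names and attributes here are invented for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ExecutableDescription:
    """One frame's prompt-carried state (hypothetical schema, not from the paper)."""
    scene_text: str
    characters: dict[str, str] = field(default_factory=dict)  # name -> canonical attributes
    layout: str = ""
    affect: str = ""

    def to_prompt(self) -> str:
        # Flatten the explicit state into a single text prompt for any generator.
        parts = [self.scene_text]
        parts += [f"{name}: {attrs}" for name, attrs in sorted(self.characters.items())]
        if self.layout:
            parts.append(f"layout: {self.layout}")
        if self.affect:
            parts.append(f"affect: {self.affect}")
        return " | ".join(parts)


def segment_story(story: str) -> list[str]:
    """Stand-in for the segmentation agent: one frame per sentence (assumption)."""
    return [s.strip() + "." for s in story.split(".") if s.strip()]


def ground_characters(scenes: list[str], sheet: dict[str, str]) -> list[ExecutableDescription]:
    """Stand-in for the grounding agent: attach the same canonical attributes
    to every frame whose text mentions the character (naive substring match)."""
    frames = []
    for scene in scenes:
        present = {name: attrs for name, attrs in sheet.items() if name in scene}
        frames.append(ExecutableDescription(scene_text=scene, characters=present))
    return frames


def enrich(frames: list[ExecutableDescription],
           cues: dict[int, tuple[str, str]]) -> list[ExecutableDescription]:
    """Stand-in for the enrichment agent: add (layout, affect) cues per frame index."""
    for index, (layout, affect) in cues.items():
        frames[index].layout = layout
        frames[index].affect = affect
    return frames
```

Because each frame holds its own explicit state, a local repair in this sketch amounts to editing one field of one `ExecutableDescription` and re-rendering only that frame's prompt; the other frames and the generator are left untouched.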
