Generating Storytelling Images with Rich Chains-of-Reasoning
Xiujie Song, Qi Jia, Shota Watanabe, Xiaoyi Pang, Ruijie Chen, Mengyue Wu, Kenny Q. Zhu

TL;DR
This paper introduces a novel method for generating storytelling images that incorporate rich, multi-layered reasoning chains, enabling more compelling visual narratives for various applications.
Contribution
It proposes the StorytellingPainter pipeline combining LLM reasoning with T2I synthesis and introduces Mini-Storytellers to improve story generation quality.
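As a rough illustration of the two-stage idea (an LLM plans the story, then a T2I model renders it), here is a minimal Python sketch. It is not the authors' implementation: `StoryPlan`, `plan_story`, `build_image_prompt`, `generate_storytelling_image`, the prompt wording, and the stand-in `fake_llm` and `fake_t2i` callables are all hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StoryPlan:
    """Hypothetical container for the stage-one output."""
    story: str
    clues: List[str]  # visual clues that should appear in the image


def plan_story(topic: str, llm: Callable[[str], str]) -> StoryPlan:
    """Stage 1 (assumed): ask an LLM for a story plus its visual clues."""
    reply = llm(
        f"Write a one-sentence story about {topic}, "
        "then list its visual clues, one per line."
    )
    lines = [ln.strip() for ln in reply.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("LLM returned an empty reply")
    # First non-empty line is treated as the story, the rest as clues.
    return StoryPlan(story=lines[0], clues=lines[1:])


def build_image_prompt(plan: StoryPlan) -> str:
    """Fold the story and its clues into a single T2I prompt."""
    return f"{plan.story} Visible details: " + "; ".join(plan.clues)


def generate_storytelling_image(
    topic: str,
    llm: Callable[[str], str],
    t2i: Callable[[str], str],
) -> str:
    """Stage 2 (assumed): hand the composed prompt to a T2I model."""
    plan = plan_story(topic, llm)
    return t2i(build_image_prompt(plan))


# Stand-in models so the sketch runs without any external service.
def fake_llm(prompt: str) -> str:
    return "A boy hides a broken vase.\nshards on the floor\na football nearby"


def fake_t2i(prompt: str) -> str:
    return f"<image for: {prompt}>"


result = generate_storytelling_image("a household accident", fake_llm, fake_t2i)
```

Passing the models in as plain callables keeps the two stages decoupled, so either one could be swapped for a real client without touching the other.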
Findings
The approach produces images with high semantic complexity and diversity.
The evaluation framework effectively measures text-image alignment.
Experimental results confirm the feasibility of the proposed methods.
Abstract
A single image can convey a compelling story through logically connected visual clues, forming Chains-of-Reasoning (CoRs). We define these semantically rich images as Storytelling Images. By conveying multi-layered information that inspires active interpretation, these images enable a wide range of applications, such as illustration and cognitive screening. Despite their potential, such images are scarce and complex to create. To address this, we introduce the Storytelling Image Generation task and propose StorytellingPainter, a two-stage pipeline combining the reasoning of Large Language Models (LLMs) with Text-to-Image (T2I) synthesis. We also develop a dedicated evaluation framework assessing semantic complexity, diversity, and text-image alignment. Furthermore, given the critical role of story generation in the task, we introduce lightweight Mini-Storytellers to bridge the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games
