SceneDecorator: Towards Scene-Oriented Story Generation with Scene Planning and Scene Consistency

Quanjian Song; Donghao Zhou; Jingyu Lin; Fei Shen; Jiaze Wang; Xiaowei Hu; Cunjian Chen; Pheng-Ann Heng

arXiv:2510.22994·cs.CV·October 28, 2025

TL;DR

SceneDecorator is a training-free framework for scene-oriented story generation. It improves narrative coherence across scenes through vision-language-model-guided planning, and scene consistency across stories through a scene-sharing attention mechanism.

Contribution

The paper presents a training-free approach that combines VLM-guided scene planning with long-term scene-sharing attention to improve scene coherence and consistency in story generation.

Findings

01 Outperforms existing methods in scene coherence and consistency

02 Enhances creativity in arts, films, and games

03 Demonstrates effectiveness through extensive experiments

Abstract

Recent text-to-image models have revolutionized image generation, but they still struggle to maintain concept consistency across generated images. While existing works focus on character consistency, they often overlook the crucial role of scenes in storytelling, which restricts their creativity in practice. This paper introduces scene-oriented story generation, addressing two key challenges: (i) scene planning, where current methods fail to ensure scene-level narrative coherence because they rely solely on text descriptions, and (ii) scene consistency, where keeping scenes consistent across multiple stories remains largely unexplored. We propose SceneDecorator, a training-free framework that employs VLM-Guided Scene Planning to ensure narrative coherence across different scenes in a "global-to-local" manner, and Long-Term Scene-Sharing Attention to maintain long-term…
