STORYANCHORS: Generating Consistent Multi-Scene Story Frames for Long-Form Narratives
Bo Wang, Haoyang Huang, Zhiying Lu, Fengyuan Liu, Guoqing Ma, Jianlong Yuan, Yuan Zhang, Nan Duan, Daxin Jiang

TL;DR
StoryAnchors is a novel framework for generating multi-scene story frames with high temporal consistency, character continuity, and scene diversity, enabling more coherent and rich long-form narratives.
Contribution
It introduces a unified bidirectional story generator with novel training techniques for improved narrative coherence and scene diversity in story frame generation.
Findings
Outperforms existing models in consistency and narrative coherence.
Supports manual editing and expansion of story frames.
Achieves comparable performance to GPT-4o in story richness.
Abstract
This paper introduces StoryAnchors, a unified framework for generating high-quality, multi-scene story frames with strong temporal consistency. The framework employs a bidirectional story generator that integrates both past and future contexts to ensure temporal consistency, character continuity, and smooth scene transitions throughout the narrative. Specific conditions are introduced to distinguish story frame generation from standard video synthesis, facilitating greater scene diversity and enhancing narrative richness. To further improve generation quality, StoryAnchors integrates Multi-Event Story Frame Labeling and Progressive Story Frame Training, enabling the model to capture both overarching narrative flow and event-level dynamics. This approach supports the creation of editable and expandable story frames, allowing for manual modifications and the generation of longer, more…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
