ContextualStory: Consistent Visual Storytelling with Spatially-Enhanced and Storyline Context
Sixiao Zheng, Yanwei Fu

TL;DR
ContextualStory is a new framework for visual storytelling that improves coherence and consistency in generated frames by using spatially-enhanced attention and storyline context, outperforming existing methods.
Contribution
It introduces Spatially-Enhanced Temporal Attention, a Storyline Contextualizer, and a StoryFlow Adapter to enhance visual storytelling coherence and efficiency.
Findings
Outperforms state-of-the-art methods on PororoSV and FlintstonesSV datasets.
Effectively captures spatial and temporal dependencies in storytelling.
Enhances scene consistency and character coherence in generated stories.
Abstract
Visual storytelling involves generating a sequence of coherent frames from a textual storyline while maintaining consistency in characters and scenes. Existing autoregressive methods, which rely on previous frame-sentence pairs, struggle with high memory usage, slow generation speeds, and limited context integration. To address these issues, we propose ContextualStory, a novel framework designed to generate coherent story frames and extend frames for visual storytelling. ContextualStory utilizes Spatially-Enhanced Temporal Attention to capture spatial and temporal dependencies, handling significant character movements effectively. Additionally, we introduce a Storyline Contextualizer to enrich context in storyline embedding, and a StoryFlow Adapter to measure scene changes between frames for guiding the model. Extensive experiments on PororoSV and FlintstonesSV datasets demonstrate that…
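The abstract does not spell out how Spatially-Enhanced Temporal Attention is computed. One plausible reading, offered here only as an illustrative sketch and not as the authors' implementation, is that each spatial position attends across all frames to a small spatial neighborhood rather than only to the same pixel location, so a character that moves between frames can still be matched. The function name `spatially_windowed_temporal_attention`, the `radius` parameter, and the omission of learned query/key/value projections are all assumptions made for brevity.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def spatially_windowed_temporal_attention(frames, radius=1):
    """Hypothetical sketch: attention over time with a local spatial window.

    frames[t][y][x] is a feature vector (list of floats).
    For every query position (t, y, x), the keys/values are the features of
    ALL frames at positions within `radius` of (y, x). Features double as
    queries, keys, and values (no learned projections; an assumption).
    """
    num_frames = len(frames)
    height = len(frames[0])
    width = len(frames[0][0])
    dim = len(frames[0][0][0])
    scale = 1.0 / math.sqrt(dim)

    output = []
    for t in range(num_frames):
        frame_out = []
        for y in range(height):
            row_out = []
            for x in range(width):
                query = frames[t][y][x]
                # Gather the spatial window around (y, x) from every frame.
                keys = [
                    frames[s][yy][xx]
                    for s in range(num_frames)
                    for yy in range(max(0, y - radius), min(height, y + radius + 1))
                    for xx in range(max(0, x - radius), min(width, x + radius + 1))
                ]
                # Scaled dot-product scores, then softmax weights.
                scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
                weights = softmax(scores)
                # Weighted sum of the gathered features.
                row_out.append(
                    [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
                )
            frame_out.append(row_out)
        output.append(frame_out)
    return output
```

With `radius=0` this reduces to plain per-position temporal attention; a larger `radius` widens the window each position can look through in the other frames, at a cost that grows with the window area times the number of frames.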
Taxonomy
Topics: Video Analysis and Summarization · Data Visualization and Analytics · Artificial Intelligence in Games
Methods: Softmax · Attention Is All You Need · Adapter
