Aether Weaver: Multimodal Affective Narrative Co-Generation with Dynamic Scene Graphs
Saeed Ghorbani

TL;DR
Aether Weaver is an integrated multimodal narrative co-generation system that simultaneously creates text, visuals, scene graphs, and soundscapes, ensuring consistency and emotional resonance for immersive storytelling.
Contribution
It introduces a novel framework that concurrently generates multimodal narrative components with dynamic scene graphs and emotional coherence, surpassing traditional sequential pipelines.
Findings
Enhances narrative depth and visual fidelity.
Improves emotional resonance across modalities.
Outperforms baseline approaches in qualitative evaluations.
Abstract
We introduce Aether Weaver, a novel, integrated framework for multimodal narrative co-generation that overcomes limitations of sequential text-to-visual pipelines. Our system concurrently synthesizes textual narratives, dynamic scene graph representations, visual scenes, and affective soundscapes, driven by a tightly integrated co-generation mechanism. At its core, the Narrator, a large language model, generates narrative text and multimodal prompts, while the Director acts as a dynamic scene graph manager: it analyzes the text to build and maintain a structured representation of the story's world, ensuring spatio-temporal and relational consistency for visual rendering and subsequent narrative generation. Additionally, a Narrative Arc Controller guides the high-level story structure and influences multimodal affective consistency, complemented by an Affective Tone Mapper that…
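To make the Director's role more concrete, here is a minimal sketch of how a dynamic scene graph might be kept consistent as the narrative advances. It is an illustration under stated assumptions, not the authors' implementation: the `SceneGraph` class, its methods, the predicate names, and the triple format are all hypothetical, and extracting triples from the Narrator's text is assumed to happen elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy dynamic scene graph (hypothetical): entities plus relations
    stored as (subject, predicate) -> object."""
    entities: set = field(default_factory=set)
    relations: dict = field(default_factory=dict)

    def apply(self, triples):
        """Merge (subject, predicate, object) triples from one narrative beat.

        A later triple with the same subject and predicate overwrites the
        earlier one, so each entity holds at most one value per predicate.
        """
        for subj, pred, obj in triples:
            self.entities.update((subj, obj))
            self.relations[(subj, pred)] = obj

    def describe(self):
        """Return sorted relation strings, e.g. as context for a later prompt."""
        return sorted(f"{s} {p} {o}" for (s, p), o in self.relations.items())


graph = SceneGraph()
# Beat 1: triples assumed to have been extracted from the narrative text.
graph.apply([("Mara", "is_in", "forest"), ("lantern", "held_by", "Mara")])
# Beat 2: Mara moves, so her previous location is replaced rather than duplicated.
graph.apply([("Mara", "is_in", "cave")])
summary = graph.describe()
```

In this sketch the overwrite rule is what keeps the world state from contradicting itself: after the second beat the graph records Mara only in the cave, while the lantern relation from the first beat persists. A fuller design would likely also track time stamps and relation removal, which the abstract does not specify.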
