Hierarchical Knowledge Graphs for Story Understanding in Visual Narratives
Yi-Chun Chen

TL;DR
This paper introduces a hierarchical knowledge graph framework for understanding visual narratives, especially comics, by organizing semantic, spatial, and temporal information across multiple levels to support interpretable reasoning.
Contribution
It presents a novel hierarchical knowledge graph approach that captures multi-level narrative structure and semantics in visual storytelling, emphasizing transparency over predictive accuracy.
Findings
Supports tasks like action retrieval and timeline reconstruction
Enables interpretable symbolic reasoning in visual narratives
Provides a foundation for explainable narrative analysis
Abstract
We present a hierarchical knowledge graph framework for the structured semantic understanding of visual narratives, using comics as a representative domain for multimodal storytelling. The framework organizes narrative content across three levels (panel, event, and macro-event) by integrating symbolic graphs that encode semantic, spatial, and temporal relationships. At the panel level, it models visual elements such as characters, objects, and actions alongside textual components including dialogue and narration. These panel-level graphs are systematically connected to higher-level graphs that capture narrative sequences and abstract story structures. Applied to a manually annotated subset of the Manga109 dataset, the framework supports interpretable symbolic reasoning across four representative tasks: action retrieval, dialogue tracing, character appearance mapping, and timeline reconstruction. Rather…
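The three-level organization described in the abstract can be pictured as nested records. What follows is a minimal Python sketch under stated assumptions, not the author's implementation: the class names (Panel, Event, MacroEvent), their fields, the helper functions, and the sample story are all hypothetical, and the spatial relationships the paper encodes are omitted for brevity. It illustrates how two of the listed tasks, action retrieval and timeline reconstruction, might reduce to simple traversals once the hierarchy is explicit.

```python
from dataclasses import dataclass, field


@dataclass
class Panel:
    """Panel level: visual and textual elements (hypothetical schema)."""
    panel_id: str
    order: int  # reading order within the story (assumed annotation)
    characters: list = field(default_factory=list)
    actions: list = field(default_factory=list)   # (character, action) pairs
    dialogue: list = field(default_factory=list)  # (speaker, text) pairs


@dataclass
class Event:
    """Event level: a group of panels forming one narrative unit."""
    event_id: str
    panels: list


@dataclass
class MacroEvent:
    """Macro-event level: a group of events forming a story segment."""
    macro_id: str
    events: list


def retrieve_action(macros, action):
    """Return (macro_id, event_id, panel_id, character) for each match."""
    hits = []
    for macro in macros:
        for event in macro.events:
            for panel in event.panels:
                for who, act in panel.actions:
                    if act == action:
                        hits.append(
                            (macro.macro_id, event.event_id, panel.panel_id, who)
                        )
    return hits


def reconstruct_timeline(macros):
    """Order event ids by the earliest panel each event contains."""
    events = [e for m in macros for e in m.events]
    events.sort(key=lambda e: min(p.order for p in e.panels))
    return [e.event_id for e in events]


# Invented sample data, stored out of order on purpose.
p1 = Panel("p1", 1, characters=["A"], actions=[("A", "run")],
           dialogue=[("A", "Wait!")])
p2 = Panel("p2", 2, characters=["A", "B"], actions=[("B", "jump")])
p3 = Panel("p3", 3, characters=["B"], actions=[("B", "run")])
story = [MacroEvent("m1", [Event("e2", [p3]), Event("e1", [p1, p2])])]

run_hits = retrieve_action(story, "run")
timeline = reconstruct_timeline(story)
```

Because every hit carries its macro-event, event, and panel identifiers, each answer can be traced back through the hierarchy, which is the kind of transparency the framework emphasizes.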
Taxonomy
Topics: Multimodal Machine Learning Applications · Artificial Intelligence in Games · Topic Modeling
