Generating Visual Stories with Grounded and Coreferent Characters
Danyang Liu, Mirella Lapata, Frank Keller

TL;DR
This paper introduces the task of character-centric visual story generation and develops a model that produces more coherent, richer stories with consistent and grounded character mentions.
Contribution
The paper presents the first model for character-grounded visual story generation, a new dataset that enriches the VIST benchmark with visual and textual character coreference chains, and new evaluation metrics for character richness and coreference.
Findings
The model generates more consistent and coreferent character mentions.
The enriched dataset improves character grounding in stories.
The proposed metrics effectively measure character richness and coreference.
Abstract
Characters are important in narratives. They move the plot forward, create emotional connections, and embody the story's themes. Visual storytelling methods focus more on the plot and events relating to it, without building the narrative around specific characters. As a result, the generated stories feel generic, with character mentions being absent, vague, or incorrect. To mitigate these issues, we introduce the new task of character-centric story generation and present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions. Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark. Specifically, we develop an automated pipeline to enrich VIST with visual and textual character coreference chains. We also propose new evaluation metrics to measure the richness of characters and coreference in…
Taxonomy
Topics: Digital Storytelling and Education · Persona Design and Applications
