Generating Visual Stories with Grounded and Coreferent Characters

Danyang Liu; Mirella Lapata; Frank Keller

arXiv:2409.13555·cs.CL·March 4, 2025

TL;DR

This paper introduces a novel task of character-centric visual story generation, developing a model that produces more coherent stories with consistent and grounded character mentions, enhancing narrative richness.

Contribution

The paper presents the first model for character-grounded visual story generation, a new dataset that enriches the VIST benchmark with visual and textual character coreference chains, and new evaluation metrics for character richness and coreference.

Findings

01. Model generates more consistent and coreferent character mentions.

02. Enriched dataset improves character grounding in stories.

03. Proposed metrics effectively measure character richness and coreference.

Abstract

Characters are important in narratives. They move the plot forward, create emotional connections, and embody the story's themes. Visual storytelling methods focus more on the plot and events relating to it, without building the narrative around specific characters. As a result, the generated stories feel generic, with character mentions being absent, vague, or incorrect. To mitigate these issues, we introduce the new task of character-centric story generation and present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions. Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark. Specifically, we develop an automated pipeline to enrich VIST with visual and textual character coreference chains. We also propose new evaluation metrics to measure the richness of characters and coreference in…

Taxonomy

Topics: Digital Storytelling and Education · Persona Design and Applications
