Knowledge-enriched Attention Network with Group-wise Semantic for Visual Storytelling
Tengpeng Li, Hanli Wang, Bin He, Chang Wen Chen

TL;DR
This paper introduces a novel end-to-end model for visual storytelling that leverages external knowledge and group-wise semantics to generate more coherent and imaginative stories from images.
Contribution
It proposes a knowledge-enriched attention network combined with a group-wise semantic module within a unified encoder-decoder framework for improved storytelling.
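The summary above does not spell out the attention computation, so the following is only a minimal sketch of the general idea: an image feature attends over a set of external concept embeddings and is enriched with the weighted result. It uses plain scaled dot-product attention as a stand-in and is not the authors' architecture. The names `attend` and `enrich`, the toy vectors, and the choice to concatenate are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, concepts):
    """Scaled dot-product attention of one query vector over concept vectors.

    Hypothetical helper: `query` stands in for an image feature and
    `concepts` for embeddings retrieved from an external knowledge source.
    """
    d = len(query)
    scores = [
        sum(q * c for q, c in zip(query, vec)) / math.sqrt(d)
        for vec in concepts
    ]
    weights = softmax(scores)
    # Weighted sum of the concept vectors, one dimension at a time.
    context = [
        sum(w * vec[i] for w, vec in zip(weights, concepts))
        for i in range(d)
    ]
    return weights, context


def enrich(image_feature, concepts):
    """Concatenate the image feature with its attended concept context.

    Concatenation is an assumption; the source does not say how the
    knowledge is fused with the visual features.
    """
    weights, context = attend(image_feature, concepts)
    return image_feature + context, weights


# Toy example: the first concept is most aligned with the image feature.
image_feature = [1.0, 0.0]
concepts = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
enriched, weights = enrich(image_feature, concepts)
```

In this toy run the concept most similar to the image feature receives the largest weight, and the enriched vector has twice the original dimensionality because of the concatenation.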
Findings
Outperforms state-of-the-art methods on the Visual Storytelling dataset
Enhances story coherence by integrating implicit external knowledge
Improves narrative quality through group-wise semantic modeling
Abstract
As a technically challenging topic, visual storytelling aims to generate an imaginative and coherent story of multiple narrative sentences from a group of relevant images. Existing methods often generate direct and rigid descriptions of apparent image-based content because they cannot explore implicit information beyond the images. Hence, these schemes cannot capture consistent dependencies from a holistic representation, which impairs the generation of reasonable and fluent stories. To address these problems, a novel knowledge-enriched attention network with a group-wise semantic model is proposed. Three main novel components are designed and supported by substantial experiments to reveal practical advantages. First, a knowledge-enriched attention network is designed to extract implicit concepts from an external knowledge system, and these concepts are followed by a cascade…
Taxonomy
Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Video Analysis and Summarization
