Story Generation from Visual Inputs: Techniques, Related Tasks, and Challenges
Daniel A. P. Oliveira, Eugénio Ribeiro, David Martins de Matos

TL;DR
This survey reviews methods for generating stories from visual data, covers related tasks, datasets, and challenges, and outlines the field's current state and future directions.
Contribution
It provides a comprehensive overview of techniques, related tasks, datasets, and evaluation metrics in visual story generation, emphasizing their principles, strengths, and limitations.
Findings
Analyzes key datasets and evaluation metrics.
Identifies common challenges across related tasks.
Highlights limitations and future research directions.
Abstract
Creating engaging narratives from visual data is crucial for automated digital media consumption, assistive technologies, and interactive entertainment. This survey covers methodologies used in the generation of these narratives, focusing on their principles, strengths, and limitations. The survey also covers tasks related to automatic story generation, such as image and video captioning, and visual question answering, as well as story generation without visual inputs. These tasks share common challenges with visual story generation and have served as inspiration for the techniques used in the field. We analyze the main datasets and evaluation metrics, providing a critical perspective on their limitations.
Taxonomy
Topics: Digital Storytelling and Education · Multimodal Machine Learning Applications · Video Analysis and Summarization
