Video Storytelling: Textual Summaries for Events
Junnan Li, Yongkang Wong, Qi Zhao, Mohan S. Kankanhalli

TL;DR
This paper introduces video storytelling, the task of generating coherent and succinct stories for long videos, and addresses it with a context-aware multimodal embedding framework and a Narrator trained with reinforcement learning.
Contribution
The paper proposes a context-aware multimodal embedding framework built on a Residual Bidirectional Recurrent Neural Network, together with a Narrator formulated as a reinforcement learning agent, to handle the diversity of stories and the length and complexity of long videos.
Findings
Outperforms state-of-the-art baselines in quantitative metrics
Achieves higher user satisfaction in user studies
Introduces the Video Story dataset for future research
Abstract
Bridging vision and natural language is a longstanding goal in computer vision and multimedia research. While earlier works focus on generating a single-sentence description for visual content, recent works have studied paragraph generation. In this work, we introduce the problem of video storytelling, which aims at generating coherent and succinct stories for long videos. Video storytelling introduces new challenges, mainly due to the diversity of the story and the length and complexity of the video. We propose novel methods to address the challenges. First, we propose a context-aware framework for multimodal embedding learning, where we design a Residual Bidirectional Recurrent Neural Network to leverage contextual information from past and future. Second, we propose a Narrator model to discover the underlying storyline. The Narrator is formulated as a reinforcement learning agent…
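The abstract does not spell out the Residual Bidirectional RNN, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a plain tanh RNN run forward and backward over per-frame feature vectors, a linear projection of the concatenated hidden states, and a skip connection that adds the projection back to the input feature. All names (`rnn_pass`, `residual_birnn`, `init_params`, the weight keys) and the dimensions are hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix as a list of rows (illustrative init only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_pass(frames, w_in, w_rec):
    """Run a plain tanh RNN over the frames; return one hidden state per step."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in frames:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def residual_birnn(frames, params):
    """Add past and future context to each frame feature via a skip connection."""
    forward = rnn_pass(frames, params["w_in_f"], params["w_rec_f"])
    # The backward pass reads the sequence in reverse, then is re-aligned.
    backward = rnn_pass(frames[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    outputs = []
    for x, h_f, h_b in zip(frames, forward, backward):
        # Project the concatenated hidden states back to the input dimension.
        context = matvec(params["w_out"], h_f + h_b)
        # Residual connection: the output is the input plus contextual information.
        outputs.append([xi + ci for xi, ci in zip(x, context)])
    return outputs


def init_params(dim, hidden, seed=0):
    """Hypothetical parameter set; sizes and scale are assumptions."""
    rng = random.Random(seed)
    return {
        "w_in_f": make_matrix(hidden, dim, rng),
        "w_rec_f": make_matrix(hidden, hidden, rng),
        "w_in_b": make_matrix(hidden, dim, rng),
        "w_rec_b": make_matrix(hidden, hidden, rng),
        "w_out": make_matrix(dim, 2 * hidden, rng),
    }


if __name__ == "__main__":
    params = init_params(dim=2, hidden=3)
    clip_features = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    for vector in residual_birnn(clip_features, params):
        print(vector)
```

Because of the skip connection, each output stays close to its original feature when the projection weights are small, while the backward pass lets later frames influence the embedding of earlier ones.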
Taxonomy
Topics: Video Analysis and Summarization · Multimodal Machine Learning Applications · Digital Storytelling and Education
