Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline
Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin

TL;DR
This paper introduces Synchronized Video Storytelling, a new task for generating coherent, synchronized narrations for video clips guided by structured storylines, supported by a new dataset and a novel framework.
Contribution
The paper proposes a new task, introduces a benchmark dataset, and develops the VideoNarrator framework for synchronized video narration guided by structured storylines.
Findings
The proposed framework effectively generates coherent narrations.
The new dataset supports research in synchronized video storytelling.
Evaluation shows the approach outperforms existing methods.
Abstract
Video storytelling is engaging multimedia content that utilizes video and its accompanying narration to attract the audience, where a key challenge is creating narrations for recorded visual scenes. Previous studies on dense video captioning and video story generation have made some progress. However, in practical applications, we typically require synchronized narrations for ongoing visual scenes. In this work, we introduce a new task of Synchronized Video Storytelling, which aims to generate synchronous and informative narrations for videos. These narrations, associated with each video clip, should relate to the visual content, integrate relevant knowledge, and have an appropriate word count corresponding to the clip's duration. Specifically, a structured storyline is beneficial to guide the generation process, ensuring coherence and integrity. To support the exploration of this task,…
Taxonomy
Topics: Video Analysis and Summarization · Digital Storytelling and Education · Multimedia Communication and Technology
