Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding
Yidan Sun, Qin Chao, Yangfeng Ji, Boyang Li

TL;DR
This paper introduces SyMoN, a large video-language dataset of movie and TV series summaries designed to advance AI story understanding through challenging multimodal and narrative features.
Contribution
The paper presents SyMoN, a large new dataset of naturalistic storytelling videos with high coverage of multimodal story events and abundant mental-state descriptions, intended to support better story understanding models.
Findings
Benchmarks on video-text retrieval demonstrate the dataset's utility.
Zero-shot alignment results highlight the importance of in-domain data.
Long-term memory is crucial for effective story comprehension.
Abstract
Despite recent advances in AI, story understanding remains an open and under-investigated problem. We collect, preprocess, and publicly release a video-language story dataset, Synopses of Movie Narratives (SyMoN), containing 5,193 video summaries of popular movies and TV series with a total length of 869 hours. SyMoN captures naturalistic storytelling videos made by human creators and intended for a human audience. As a prototypical and naturalistic story dataset, SyMoN features high coverage of multimodal story events and abundant mental-state descriptions. Its use of storytelling techniques causes cross-domain semantic gaps that provide appropriate challenges to existing models. We establish benchmarks on video-text retrieval and zero-shot alignment on movie summary videos, which showcase the importance of in-domain data and long-term memory in story understanding. With SyMoN, we hope…
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Digital Storytelling and Education
