VidEvent: A Large Dataset for Understanding Dynamic Evolution of Events in Videos
Baoyu Liang, Qile Su, Shoutai Zhu, Yuchen Liang, Chao Tong

TL;DR
VidEvent is a large-scale dataset of over 23,000 annotated events drawn from movie recap videos, built to support research on the structure and evolution of events in video.
Contribution
The paper presents VidEvent, a comprehensive dataset with detailed event annotations and baseline models, supporting research in dynamic video event understanding.
Findings
VidEvent contains over 23,000 well-labeled events.
Baseline models demonstrate the dataset's utility for event understanding.
The dataset facilitates future algorithm development.
Abstract
Despite the significant impact of visual events on human cognition, understanding events in videos remains a challenging task for AI due to their complex structures, semantic hierarchies, and dynamic evolution. To address this, we propose the task of video event understanding that extracts event scripts and makes predictions with these scripts from videos. To support this task, we introduce VidEvent, a large-scale dataset containing over 23,000 well-labeled events, featuring detailed event structures, broad hierarchies, and logical relations extracted from movie recap videos. The dataset was created through a meticulous annotation process, ensuring high-quality and reliable event data. We also provide comprehensive baseline models offering detailed descriptions of their architecture and performance metrics. These models serve as benchmarks for future research, facilitating comparisons…
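To make the abstract's notion of an event script concrete, here is a minimal Python sketch of how events with structure, hierarchy, and logical relations might be represented. It is an illustration only, not the dataset's actual schema: every class name, field name, and relation label below (`Event`, `Relation`, `EventScript`, `trigger`, `arguments`, `parent_id`, "precedes") is a hypothetical choice, and the sample events are invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    """One annotated event. All field names are hypothetical."""
    event_id: str
    trigger: str                                   # action word, e.g. "escape"
    arguments: Dict[str, str] = field(default_factory=dict)  # role -> participant
    parent_id: Optional[str] = None                # hierarchy: enclosing event, if any


@dataclass
class Relation:
    """A directed logical relation between two events (label set is assumed)."""
    source_id: str
    target_id: str
    label: str                                     # e.g. "precedes", "causes"


@dataclass
class EventScript:
    """A collection of events plus the relations that link them."""
    events: List[Event] = field(default_factory=list)
    relations: List[Relation] = field(default_factory=list)

    def children_of(self, event_id: str) -> List[Event]:
        # Sub-events directly nested under the given event.
        return [e for e in self.events if e.parent_id == event_id]

    def successors(self, event_id: str, label: str) -> List[str]:
        # IDs of events reached from event_id via relations with this label.
        return [r.target_id for r in self.relations
                if r.source_id == event_id and r.label == label]


# Invented sample data, for illustration only.
script = EventScript(
    events=[
        Event("e1", "heist", {"agent": "crew"}),
        Event("e2", "break_in", {"agent": "crew", "place": "vault"}, parent_id="e1"),
        Event("e3", "escape", {"agent": "crew"}, parent_id="e1"),
    ],
    relations=[Relation("e2", "e3", "precedes")],
)
```

Under these assumptions, a prediction task of the kind the abstract describes could be framed as inferring a missing `Relation` or a missing successor `Event` from the rest of the script.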
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting
