Generic Event Boundary Detection: A Benchmark for Event Segmentation
Mike Zheng Shou, Stan Weixian Lei, Weiyao Wang, Deepti Ghadiyaram, Matt Feiszli

TL;DR
This paper introduces a new task and benchmark for detecting generic, taxonomy-free event boundaries in videos, aiming to better understand video segmentation as perceived by humans.
Contribution
It defines the task of Generic Event Boundary Detection (GEBD), creates the Kinetics-GEBD benchmark with extensive annotations, and evaluates baseline approaches on this new dataset.
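As a rough illustration of what the task's output means, the sketch below turns a list of boundary timestamps into the chunks they induce on a video. The function name `boundaries_to_segments` and the use of seconds as the unit are assumptions made for this example; they are not taken from the paper or from the Kinetics-GEBD tooling.

```python
from typing import List, Tuple


def boundaries_to_segments(
    duration: float, boundaries: List[float]
) -> List[Tuple[float, float]]:
    """Split a video of length `duration` into chunks at the given boundaries.

    Hypothetical helper: timestamps are assumed to be in seconds.
    """
    # Keep only boundaries strictly inside the video, deduplicated and sorted.
    inner = sorted({b for b in boundaries if 0.0 < b < duration})
    # The video start and end close the first and last chunk.
    cuts = [0.0] + inner + [duration]
    # Pair each cut with the next one to form (start, end) chunks.
    return list(zip(cuts[:-1], cuts[1:]))


if __name__ == "__main__":
    # A 10-second video with two taxonomy-free boundaries yields three chunks.
    print(boundaries_to_segments(10.0, [2.5, 7.0]))
```

Note that the chunks carry no class labels: in a taxonomy-free setting only the boundary positions are predicted, so a segment is fully described by its start and end.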
Findings
Annotations align with human perception of event boundaries.
Benchmark results show that the GEBD task is challenging for the baseline approaches.
The dataset covers diverse, in-the-wild videos.
Abstract
This paper presents a novel task together with a new benchmark for detecting generic, taxonomy-free event boundaries that segment a whole video into chunks. Conventional work in temporal video segmentation and action detection focuses on localizing pre-defined action categories and thus does not scale to generic videos. Cognitive Science has known since last century that humans consistently segment videos into meaningful temporal chunks. This segmentation happens naturally, without pre-defined event categories and without being explicitly asked to do so. Here, we repeat these cognitive experiments on mainstream CV datasets; with our novel annotation guideline which addresses the complexities of taxonomy-free event boundary annotation, we introduce the task of Generic Event Boundary Detection (GEBD) and the new benchmark Kinetics-GEBD. Our Kinetics-GEBD has the largest number of…
Videos
2nd International Workshop and Challenge on Long form Video Understanding · YouTube
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications
