Generic Event Boundary Detection: A Benchmark for Event Segmentation

Mike Zheng Shou; Stan Weixian Lei; Weiyao Wang; Deepti Ghadiyaram; Matt Feiszli

arXiv:2101.10511 · cs.CV · August 20, 2021 · 1 citation

TL;DR

This paper introduces a new task and benchmark for detecting generic, taxonomy-free event boundaries in videos, aiming to better understand video segmentation as perceived by humans.

Contribution

It defines the task of Generic Event Boundary Detection (GEBD), creates the Kinetics-GEBD benchmark with extensive annotations, and evaluates baseline approaches on this new dataset.

Findings

01

Annotations align with human perception of event boundaries.

02

Benchmark results show that the GEBD task is challenging.

03

The dataset covers diverse, in-the-wild videos.

Abstract

This paper presents a novel task together with a new benchmark for detecting generic, taxonomy-free event boundaries that segment a whole video into chunks. Conventional work in temporal video segmentation and action detection focuses on localizing pre-defined action categories and thus does not scale to generic videos. Cognitive Science has known since last century that humans consistently segment videos into meaningful temporal chunks. This segmentation happens naturally, without pre-defined event categories and without being explicitly asked to do so. Here, we repeat these cognitive experiments on mainstream CV datasets; with our novel annotation guideline which addresses the complexities of taxonomy-free event boundary annotation, we introduce the task of Generic Event Boundary Detection (GEBD) and the new benchmark Kinetics-GEBD. Our Kinetics-GEBD has the largest number of…
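The abstract describes event boundaries as segmenting a whole video into chunks. As a minimal sketch of that relationship only, and not the paper's method or evaluation code, the snippet below turns a list of boundary timestamps into (start, end) chunks. The function name `boundaries_to_segments`, the use of seconds as the unit, and the example timestamps are all assumptions made for illustration.

```python
from typing import List, Tuple


def boundaries_to_segments(
    boundaries: List[float], duration: float
) -> List[Tuple[float, float]]:
    """Split [0, duration] into chunks at the given boundary timestamps.

    Hypothetical helper for illustration; timestamps are assumed to be seconds.
    """
    # Keep only boundaries strictly inside the video, de-duplicated and sorted.
    cuts = sorted({b for b in boundaries if 0.0 < b < duration})
    # The video start and end close the first and last chunk.
    edges = [0.0] + cuts + [duration]
    # Pair consecutive edges into (start, end) segments.
    return list(zip(edges[:-1], edges[1:]))


# Made-up boundaries for a 10-second clip (not data from the benchmark).
segments = boundaries_to_segments([2.5, 7.0, 4.1], duration=10.0)
print(segments)  # [(0.0, 2.5), (2.5, 4.1), (4.1, 7.0), (7.0, 10.0)]
```

With no category labels attached, the chunks are defined purely by where the boundaries fall, which is what "taxonomy-free" implies for the output format.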

Code & Models

Videos

2nd International Workshop and Challenge on Long form Video Understanding · YouTube

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications