Generic Event Boundary Detection in Video with Pyramid Features

Van Thong Huynh; Hyung-Jeong Yang; Guee-Sang Lee; Soo-Hyung Kim

arXiv:2301.04288 · cs.CV · January 12, 2023

Open Access · 1 Repo

TL;DR

This paper introduces a method for generic event boundary detection (GEBD) in video. It builds pyramid features from a pre-trained ResNet-50, measures the similarity between neighboring frames across spatial and temporal scales, and reports better results than existing approaches.

Contribution

The study proposes a new framework leveraging pyramid feature maps and a similarity-based decoding process for improved event boundary detection in videos.

Findings

01 Outperforms state-of-the-art on GEBD benchmark

02 Effective on long-form Olympic sport videos

03 Utilizes multi-scale spatial-temporal features

Abstract

Generic event boundary detection (GEBD) aims to split video into chunks at a broad and diverse set of actions as humans naturally perceive event boundaries. In this study, we present an approach that considers the correlation between neighbor frames with pyramid feature maps in both spatial and temporal dimensions to construct a framework for localizing generic events in video. The features at multiple spatial dimensions of a pre-trained ResNet-50 are exploited with different views in the temporal dimension to form a temporal pyramid feature map. Based on that, the similarity between neighbor frames is calculated and projected to build a temporal pyramid similarity feature vector. A decoder with 1D convolution operations is used to decode these similarities to a new representation that incorporates their temporal relationship for later boundary score estimation. Extensive experiments…
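The similarity-then-decode idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes each frame is already reduced to one feature vector (for example, pooled activations from a pre-trained backbone), it covers only the temporal side of the pyramid, and it replaces the learned 1D convolutional decoder with a fixed smoothing kernel. The names `pyramid_similarity`, `conv1d_same`, `boundary_scores`, the window sizes, and the kernel weights are all hypothetical choices made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def mean_vector(frames):
    """Element-wise mean of a non-empty list of feature vectors."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def pyramid_similarity(features, windows=(1, 2, 4)):
    """For each frame t and window size k, compare the mean of the k frames
    before t with the mean of the k frames starting at t.
    Windows are clipped at the start and end of the video."""
    total = len(features)
    rows = []
    for t in range(total):
        row = []
        for k in windows:
            # At t == 0 there is no past context, so fall back to the frame itself.
            before = features[max(0, t - k):t] or [features[t]]
            after = features[t:min(total, t + k)]
            row.append(cosine(mean_vector(before), mean_vector(after)))
        rows.append(row)
    return rows


def conv1d_same(signal, kernel):
    """Zero-padded 1D convolution (cross-correlation form) with same-length
    output. Assumes an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(signal))]


def boundary_scores(features, windows=(1, 2, 4), kernel=(0.25, 0.5, 0.25)):
    """Average dissimilarity over the temporal scales, then smooth it.
    The fixed kernel is a stand-in for a learned decoder (assumption)."""
    sims = pyramid_similarity(features, windows)
    dissim = [1.0 - sum(row) / len(row) for row in sims]
    return conv1d_same(dissim, kernel)


if __name__ == "__main__":
    # Toy video: four frames of one "event", then four frames of another.
    toy = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
    scores = boundary_scores(toy)
    # The peak falls at index 4, the first frame of the second segment.
    print(scores.index(max(scores)))
```

Comparing averaged context windows of several sizes gives each frame a short vector of similarities at different temporal scales, which is the role the abstract assigns to the temporal pyramid similarity feature. A trained model would learn how to weight and decode those scales rather than averaging them.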

Code & Models

Repositories

th2l/GEBD-PyramidFeatureSimilarity · tf · Official

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Analysis and Summarization

Methods: Convolution