Winning the CVPR'2021 Kinetics-GEBD Challenge: Contrastive Learning Approach

Hyolim Kang; Jinwoo Kim; Kyungmin Kim; Taehyun Kim; Seon Joo Kim

arXiv:2106.11549·cs.CV·June 23, 2021·5 cites

TL;DR

This paper presents a contrastive learning approach to generic event boundary detection (GEBD) in videos. It uses a temporal self-similarity matrix as an intermediate representation to locate event boundaries that match human perception, and it won the CVPR 2021 Kinetics-GEBD Challenge.

Contribution

Introduces a contrastive learning method for GEBD that uses a temporal self-similarity matrix as an intermediate representation, improving boundary detection over the baselines provided for the challenge.

Findings

01 Significant performance boost over baselines.

02 Effective use of temporal self-similarity matrices.

03 Achieved top results in CVPR 2021 Kinetics-GEBD Challenge.

Abstract

Generic Event Boundary Detection (GEBD) is a newly introduced task that aims to detect "general" event boundaries that correspond to natural human perception. In this paper, we introduce a novel contrastive-learning-based approach to GEBD. Our intuition is that the feature similarity of video snippets varies significantly near event boundaries while remaining relatively constant in the rest of the video. In our model, a Temporal Self-similarity Matrix (TSM) is used as an intermediate representation that acts as an information bottleneck. With our model, we achieve a significant performance boost over the given baselines. Our code is available at https://github.com/hello-jinwoo/LOVEU-CVPR2021.
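
The intuition in the abstract can be illustrated with a small sketch. It is not the authors' model: the per-frame features are toy vectors rather than learned embeddings, and `boundary_scores` is a hypothetical hand-written heuristic standing in for whatever learned component reads the TSM in the actual system. The function names and the `window` parameter are assumptions made for illustration. The sketch builds a cosine-similarity TSM and scores each time step by how dissimilar the frames just before it are from the frames just after it.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def temporal_self_similarity(features):
    """Return the T x T matrix whose (i, j) entry compares frame i with frame j."""
    n = len(features)
    return [[cosine(features[i], features[j]) for j in range(n)] for i in range(n)]


def boundary_scores(tsm, window=2):
    """Hypothetical heuristic (not from the paper): for each time step t, take
    1 - mean similarity between the `window` frames before t and the `window`
    frames starting at t. High values suggest a change in content at t."""
    n = len(tsm)
    scores = [0.0] * n
    for t in range(window, n - window + 1):
        cross = [tsm[i][j]
                 for i in range(t - window, t)
                 for j in range(t, t + window)]
        scores[t] = 1.0 - sum(cross) / len(cross)
    return scores


# Toy "video": four frames of one event, then four frames of another.
features = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
tsm = temporal_self_similarity(features)
scores = boundary_scores(tsm, window=2)
predicted = max(range(len(scores)), key=scores.__getitem__)
print(predicted)  # index of the frame where the second event starts
```

In this toy input the TSM has two bright diagonal blocks, and the score peaks where the blocks meet. A real pipeline would presumably replace the toy vectors with snippet features from a trained encoder and replace the heuristic with a learned predictor.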

Code & Models

Repositories

hello-jinwoo/LOVEU-CVPR2021
PyTorch · Official

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications

Methods: Contrastive Learning