Generic Event Boundary Detection via Denoising Diffusion
Jaejun Hwang, Dayoung Gong, Manjin Kim, Minsu Cho

TL;DR
This paper presents DiffGEBD, a generative diffusion model for event boundary detection in videos, which captures diverse plausible boundaries and outperforms previous deterministic methods on standard benchmarks.
Contribution
The paper introduces a novel diffusion-based approach for GEBD that models boundary diversity and proposes a new evaluation metric for assessing prediction quality.
Findings
Achieves strong performance on Kinetics-GEBD and TAPOS benchmarks.
Generates diverse and plausible event boundaries.
Controls diversity via classifier-free guidance.
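As a rough illustration of the last finding, classifier-free guidance combines a conditional and an unconditional denoiser output with a single scale. The sketch below uses one common parameterization; the function name, the list-based inputs, and the scale `w` are illustrative assumptions, not the formulation or code used in the paper. In diffusion models generally, a larger scale tends to pull samples toward the conditioning signal and reduce their spread.

```python
from typing import List


def classifier_free_guidance(cond: List[float],
                             uncond: List[float],
                             w: float) -> List[float]:
    """Blend conditional and unconditional denoiser outputs.

    Hypothetical helper for illustration only.
    w = 0 returns the unconditional output, w = 1 the conditional one,
    and w > 1 extrapolates past the conditional output.
    """
    if len(cond) != len(uncond):
        raise ValueError("cond and uncond must have the same length")
    # guided = uncond + w * (cond - uncond), applied element-wise
    return [u + w * (c - u) for c, u in zip(cond, uncond)]


# Toy per-frame outputs (made-up numbers, not model predictions).
cond_out = [1.0, 2.0]
uncond_out = [0.0, 0.0]
guided = classifier_free_guidance(cond_out, uncond_out, w=2.0)
```

Sweeping `w` over a few values and sampling several times per value is one way to see how the scale changes the variety of predicted boundaries.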
Abstract
Generic event boundary detection (GEBD) aims to identify natural boundaries in a video, segmenting it into distinct and meaningful chunks. Despite the inherent subjectivity of event boundaries, previous methods have focused on deterministic predictions, overlooking the diversity of plausible solutions. In this paper, we introduce a novel diffusion-based boundary detection model, dubbed DiffGEBD, that tackles the problem of GEBD from a generative perspective. The proposed model encodes relevant changes across adjacent frames via temporal self-similarity and then iteratively denoises random noise into plausible event boundaries, conditioned on the encoded features. Classifier-free guidance allows the degree of diversity to be controlled during denoising. In addition, we introduce a new evaluation metric to assess the quality of predictions considering both diversity and…
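To make the idea of temporal self-similarity concrete, the sketch below builds a frame-by-frame cosine-similarity matrix from per-frame feature vectors. It is a minimal illustration under stated assumptions: the function name, the toy two-dimensional features, and the choice of cosine similarity are hypothetical, and the paper's actual encoder is not described in this summary.

```python
import math
from typing import List


def temporal_self_similarity(features: List[List[float]]) -> List[List[float]]:
    """Return a T x T matrix of cosine similarities between frame features.

    Hypothetical helper for illustration; `features` holds one vector per frame.
    """
    normed = []
    for f in features:
        # Guard against zero vectors so the division is always defined.
        norm = math.sqrt(sum(x * x for x in f)) or 1.0
        normed.append([x / norm for x in f])
    return [[sum(a * b for a, b in zip(fi, fj)) for fj in normed]
            for fi in normed]


# Toy features: frames 0-1 look alike, frames 2-3 look alike (made-up data).
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
tsm = temporal_self_similarity(frames)
```

In this toy case the similarity between adjacent frames 1 and 2 (`tsm[1][2]`) drops to zero while the other adjacent pairs stay at one, which is the kind of local change a boundary detector could condition on.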
Taxonomy
Topics: Anomaly Detection Techniques and Applications
