Learning Latent Super-Events to Detect Multiple Activities in Videos

AJ Piergiovanni; Michael S. Ryoo

arXiv:1712.01938·cs.CV·March 30, 2018

TL;DR

This paper introduces a method for learning latent super-events, sets of events that occur together with a particular temporal organization, and uses them to improve activity detection in unsegmented videos that contain multiple activities.

Contribution

It presents a new approach to learn latent super-events using temporal structure filters and attention mechanisms, enhancing activity detection in continuous videos.

Findings

01. Significantly improves activity detection accuracy.

02. Outperforms previous state-of-the-art methods.

03. Effective on multiple public datasets.

Abstract

In this paper, we introduce the concept of learning latent super-events from activity videos, and present how it benefits activity detection in continuous videos. We define a super-event as a set of multiple events occurring together in videos with a particular temporal organization; it is the opposite concept of sub-events. Real-world videos contain multiple activities and are rarely segmented (e.g., surveillance videos), and learning latent super-events allows the model to capture how the events are temporally related in videos. We design temporal structure filters that enable the model to focus on particular sub-intervals of the videos, and use them together with a soft attention mechanism to learn representations of latent super-events. Super-event representations are combined with per-frame or per-segment CNNs to provide frame-level annotations. Our approach is designed to be fully…
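
The pipeline described in the abstract (temporal structure filters that focus on sub-intervals, soft attention over those filters, and a combination with per-frame features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The Cauchy-shaped kernel, the function names, the parameter shapes, and the additive combination with per-frame logits are all assumptions made for illustration, since the excerpt above does not specify them.

```python
import numpy as np

def temporal_filter(length, center, width):
    """Normalized weights over `length` time steps, peaked at a relative
    position `center` in [0, 1]. The Cauchy shape is an assumed kernel."""
    t = np.arange(length)
    c = center * (length - 1)  # map relative center to a frame index
    w = 1.0 / (np.pi * width * (1.0 + ((t - c) / width) ** 2))
    return w / w.sum()  # weights sum to 1 over the video

def softmax(x):
    x = np.asarray(x, dtype=float)
    e = np.exp(x - x.max())
    return e / e.sum()

def super_event_representation(features, centers, widths, attention_logits):
    """features: (T, D) per-frame features. Returns a (D,) video-level vector.
    Each filter pools one sub-interval; soft attention mixes the pooled vectors."""
    T = features.shape[0]
    filters = np.stack([temporal_filter(T, c, g)
                        for c, g in zip(centers, widths)])  # (M, T)
    pooled = filters @ features                             # (M, D)
    attn = softmax(attention_logits)                        # (M,)
    return attn @ pooled                                    # (D,)

def per_frame_scores(features, super_event, w_frame, w_super):
    """Hypothetical combination: per-frame logits plus a shared super-event
    term, followed by a sigmoid for multi-label, frame-level output."""
    logits = features @ w_frame + super_event @ w_super     # (T, C)
    return 1.0 / (1.0 + np.exp(-logits))

# Usage with random stand-in features (no real video data).
rng = np.random.default_rng(0)
T, D, C = 50, 8, 3
features = rng.normal(size=(T, D))
rep = super_event_representation(
    features, centers=[0.2, 0.5, 0.8], widths=[2.0, 5.0, 3.0],
    attention_logits=np.zeros(3))
probs = per_frame_scores(features, rep,
                         rng.normal(size=(D, C)), rng.normal(size=(D, C)))
print(probs.shape)  # (50, 3)
```

In a trained model the centers, widths, attention logits, and weight matrices would be learned rather than fixed by hand as they are here. The sketch only shows how a video-level context vector can be shared across all frames when producing frame-level predictions.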
