Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos

Serena Yeung; Olga Russakovsky; Ning Jin; Mykhaylo Andriluka; Greg Mori; Li Fei-Fei

arXiv:1507.05738·cs.CV·June 12, 2017·84 cites

TL;DR

This paper introduces MultiTHUMOS, a densely labeled action dataset that extends THUMOS to unconstrained internet videos, and proposes a novel LSTM variant that improves action labeling accuracy and enables deeper video understanding tasks.

Contribution

The paper extends the THUMOS dataset with dense, multi-label annotations and defines a new LSTM variant that models temporal relations within and across action classes via multiple input and output connections.

Findings

01

Modeling multiple, dense labels benefits from temporal relations, and the proposed model improves action labeling accuracy.

02

The proposed LSTM variant effectively models temporal relations.

03

The model further enables deeper understanding tasks, ranging from structured retrieval to action prediction.

Abstract

Every moment counts in action recognition. A comprehensive understanding of human activity in video requires labeling every frame according to the actions occurring, placing multiple labels densely over a video sequence. To study this problem we extend the existing THUMOS dataset and introduce MultiTHUMOS, a new dataset of dense labels over unconstrained internet videos. Modeling multiple, dense labels benefits from temporal relations within and across classes. We define a novel variant of long short-term memory (LSTM) deep networks for modeling these temporal relations via multiple input and output connections. We show that this model improves action labeling accuracy and further enables deeper understanding tasks ranging from structured retrieval to action prediction.
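As a rough illustration of what "placing multiple labels densely over a video sequence" can mean in practice, the sketch below converts interval annotations into a per-frame multi-hot label matrix. The annotation format `(class_name, start_sec, end_sec)`, the function name `dense_frame_labels`, and the class names are all assumptions made for this example; they are not the MultiTHUMOS file format or the authors' code.

```python
import math
from typing import List, Tuple

# Hypothetical annotation format (an assumption for this sketch):
# (class_name, start_sec, end_sec), with the interval taken as [start, end).
Annotation = Tuple[str, float, float]


def dense_frame_labels(
    annotations: List[Annotation],
    classes: List[str],
    num_frames: int,
    fps: float,
) -> List[List[int]]:
    """Build a num_frames x len(classes) multi-hot matrix.

    Frame t is assumed to sit at time t / fps, and it receives label c
    whenever some annotation of class c satisfies start <= t / fps < end.
    Several classes may be active on the same frame.
    """
    index = {name: i for i, name in enumerate(classes)}
    labels = [[0] * len(classes) for _ in range(num_frames)]
    for name, start, end in annotations:
        first = max(0, math.ceil(start * fps))
        last = min(num_frames, math.ceil(end * fps))
        for t in range(first, last):
            labels[t][index[name]] = 1
    return labels


# Two overlapping actions in a 5-frame clip sampled at 2 frames per second.
example = dense_frame_labels(
    annotations=[("run", 0.0, 2.0), ("jump", 1.0, 1.5)],
    classes=["run", "jump"],
    num_frames=5,
    fps=2.0,
)
```

Each row of `example` is one frame, so a frame where both actions overlap carries two active labels. A per-frame model would then be trained against rows of this matrix rather than against a single clip-level class.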

Code & Models

Repositories

lauradhatt/Interesting-Reads

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications