Online Spatiotemporal Action Detection and Prediction via Causal Representations

Gurkirt Singh

arXiv:2008.13759·cs.CV·September 1, 2020

TL;DR

This paper develops an online, real-time approach for spatiotemporal action detection and prediction in videos, extending offline methods with causal representations to enable future action forecasting.

Contribution

It introduces an online system that converts the offline detection pipeline into real-time action tube detection, extends it to future action prediction by regression, and argues that causal representations can achieve performance similar to that of offline 3D CNNs.

Findings

01 Online action tube detection is feasible in real time.

02 Causal representations achieve accuracy comparable to offline 3D CNNs.

03 The method enables early action prediction and temporal action segmentation.

Abstract

In this thesis, we focus on video action understanding problems from an online and real-time processing point of view. We start with the conversion of the traditional offline spatiotemporal action detection pipeline into an online spatiotemporal action tube detection system. An action tube is a set of bounding boxes connected over time, which bounds an action instance in space and time. Next, we explore the future prediction capabilities of such detection methods by extending an existing action tube into the future by regression. Later, we seek to establish that online/causal representations can achieve performance similar to that of offline three-dimensional (3D) convolutional neural networks (CNNs) on various tasks, including action recognition, temporal action segmentation and early prediction.
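To make the notion of an action tube concrete, the following is a minimal sketch of building tubes online, one frame at a time. It is not the thesis's algorithm: greedy IoU-based linking is a simple stand-in chosen for illustration, and the names `Tube`, `iou`, `update_tubes` and the `min_iou` threshold are all hypothetical. The only property it aims to show is causality: each update uses the current frame's detections and the tubes built so far, never future frames.

```python
from dataclasses import dataclass, field


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


@dataclass
class Tube:
    """An action tube: one box per frame, starting at frame `start`."""
    start: int
    boxes: list = field(default_factory=list)


def update_tubes(tubes, frame_idx, detections, min_iou=0.5):
    """Extend tubes with the current frame's detections (assumed greedy rule).

    Causal by construction: only past tubes and the current frame are used.
    """
    unmatched = list(detections)
    for tube in tubes:
        # Only tubes that reached the previous frame can be extended.
        alive = tube.start + len(tube.boxes) == frame_idx
        if not alive or not unmatched:
            continue
        best = max(unmatched, key=lambda d: iou(tube.boxes[-1], d))
        if iou(tube.boxes[-1], best) >= min_iou:
            tube.boxes.append(best)
            unmatched.remove(best)
    # Any detection left over starts a new tube at this frame.
    for det in unmatched:
        tubes.append(Tube(start=frame_idx, boxes=[det]))
    return tubes


# Toy per-frame detections (made-up coordinates, for illustration only).
frames = [
    [(0, 0, 10, 10)],
    [(1, 0, 11, 10), (50, 50, 60, 60)],
    [(2, 0, 12, 10)],
]
tubes = []
for t, dets in enumerate(frames):
    tubes = update_tubes(tubes, t, dets)
```

In this toy run the slowly moving box is linked across all three frames, while the isolated detection in the second frame starts a separate tube that is not extended. A real system would also score detections per action class and handle missed detections, which this sketch omits.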

Code & Models

Repositories

https://bitbucket.org/sahasuman/bmvc2016_code
none · Official

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Analysis and Summarization