T-RECS: Training for Rate-Invariant Embeddings by Controlling Speed for Action Recognition

Madan Ravi Ganesh; Eric Hofesmann; Byungsu Min; Nadha Gafoor; Jason J. Corso

arXiv:1803.08094 · cs.CV · March 26, 2018


Open Access

TL;DR

This paper introduces T-RECS, a preprocessing technique that enhances action recognition models by making them invariant to input video speed variations, improving accuracy and stability across diverse datasets.

Contribution

The paper proposes T-RECS, a speed-adaptive input resampling method that improves rate-invariance in deep action recognition models, applicable to multiple architectures.
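To make "speed-adaptive input resampling" concrete, here is a minimal sketch of playing a clip back at a chosen speed by picking frame indices. It is an illustration only, not the authors' procedure: the function names `resample_indices` and `resample_clip`, the floor-based nearest-frame selection, and the looping of short videos are all assumptions, since this page does not describe how T-RECS chooses or applies speeds.

```python
def resample_indices(num_frames, speed, clip_len):
    """Return clip_len frame indices that play a video at `speed`x.

    Hypothetical helper, not taken from the paper.
    speed > 1 skips frames (faster playback); speed < 1 repeats
    frames (slower playback). Indices that run past the end wrap
    around, i.e. the video is looped (an assumption).
    """
    if num_frames <= 0 or clip_len <= 0 or speed <= 0:
        raise ValueError("num_frames, clip_len and speed must be positive")
    # Floor each fractional position to the nearest earlier frame.
    return [int(i * speed) % num_frames for i in range(clip_len)]


def resample_clip(frames, speed, clip_len):
    """Apply resample_indices to any indexable sequence of frames."""
    indices = resample_indices(len(frames), speed, clip_len)
    return [frames[i] for i in indices]


if __name__ == "__main__":
    frames = list("abcdefghij")            # stand-in for 10 video frames
    print(resample_clip(frames, 2.0, 4))   # every second frame
    print(resample_clip(frames, 0.5, 4))   # each frame shown twice
```

Because the resampling acts only on the input frames, a step of this kind can sit in front of any clip-based network, which is consistent with the model-agnostic claim above.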

Findings

01. T-RECS improves I3D model performance by at least 2.9% on HMDB51.

02. T-RECS increases stability of C3D models by 59% on HMDB51.

03. T-RECS is model-agnostic and effective across different architectures.

Abstract

An action should remain identifiable when modifying its speed: consider the contrast between an expert chef and a novice chef each chopping an onion. Here, we expect the novice chef to have a relatively measured and slow approach to chopping when compared to the expert. In general, the speed at which actions are performed, whether slower or faster than average, should not dictate how they are recognized. We explore the erratic behavior caused by this phenomenon on state-of-the-art deep network-based methods for action recognition in terms of maximum performance and stability in recognition accuracy across a range of input video speeds. By observing the trends in these metrics and summarizing them based on expected temporal behaviour w.r.t. variations in input video speeds, we find two distinct types of network architectures. In this paper, we propose a preprocessing method named T-RECS,…


Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings