Action Recognition Using Volumetric Motion Representations

Michael Peven; Gregory D. Hager; Austin Reiter

arXiv:1911.08511 · cs.CV · November 21, 2019

TL;DR

This paper introduces a volumetric 3D motion representation for action recognition, leveraging 3D CNNs and data augmentation to improve accuracy and viewpoint invariance, demonstrated on the NTU RGB+D dataset.

Contribution

It proposes a novel voxelized 3D motion representation for action recognition, enabling real-time processing and improved performance over existing 2D methods.

Findings

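01 Outperforms the state of the art on the NTU RGB+D dataset on both of its defined evaluation metrics

02 Runs in real time, both when constructing the representation from RGB-D video and at inference

03 Improves viewpoint invariance through out-of-plane data augmentation during training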

Abstract

Traditional action recognition models are constructed around the paradigm of 2D perspective imagery. Though sophisticated time-series models have pushed the field forward, much of the information is still not exploited by confining the domain to 2D. In this work, we introduce a novel representation of motion as a voxelized 3D vector field and demonstrate how it can be used to improve performance of action recognition networks. This volumetric representation is a natural fit for 3D CNNs, and allows out-of-plane data augmentation techniques during training of these networks. Both the construction of this representation from RGB-D video and inference can be run in real time. We demonstrate superior results using this representation with our network design on the open-source NTU RGB+D dataset where it outperforms state-of-the-art on both of the defined evaluation metrics. Furthermore, we…
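To make the idea of a "voxelized 3D vector field" and "out-of-plane data augmentation" concrete, the following is a minimal sketch in Python with NumPy. It is not the authors' implementation: the function names `voxelize_flow` and `rotate_about_y`, the grid size, the cubic bounds, and the choice to average motion vectors per voxel are all assumptions made for illustration. It assumes each 3D point (for example, back-projected from an RGB-D frame) already carries a 3D motion vector.

```python
import numpy as np


def rotate_about_y(points, flows, angle_rad):
    """Rotate points and their motion vectors about the vertical (y) axis.

    Hypothetical helper: a rotation like this moves the scene out of the
    original image plane, which a 2D representation cannot express.
    """
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    rot = np.array([[c, 0.0, s],
                    [0.0, 1.0, 0.0],
                    [-s, 0.0, c]])
    # Row vectors, so multiply by the transpose.
    return points @ rot.T, flows @ rot.T


def voxelize_flow(points, flows, grid_size=32, lo=-1.0, hi=1.0):
    """Bin per-point 3D motion vectors into a (3, G, G, G) grid.

    Assumption: the scene fits in the cube [lo, hi)^3 and each voxel
    stores the mean motion vector of the points that fall inside it.
    """
    idx = np.floor((points - lo) / (hi - lo) * grid_size).astype(int)
    inside = np.all((idx >= 0) & (idx < grid_size), axis=1)
    idx, flows = idx[inside], flows[inside]

    field = np.zeros((3, grid_size, grid_size, grid_size))
    counts = np.zeros((grid_size, grid_size, grid_size))
    cell = (idx[:, 0], idx[:, 1], idx[:, 2])

    # np.add.at accumulates correctly when several points share a voxel.
    np.add.at(counts, cell, 1)
    for axis in range(3):
        np.add.at(field[axis], cell, flows[:, axis])

    occupied = counts > 0
    field[:, occupied] /= counts[occupied]
    return field
```

The resulting array has a channels-first layout (three motion components over a cubic grid), which is the shape a 3D convolution typically expects once a batch dimension is added. Applying `rotate_about_y` with a random angle before `voxelize_flow` is one plausible way to realize the augmentation the abstract describes.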

Code & Models

Repositories

mpeven/ntu_rgb (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Diabetic Foot Ulcer Assessment and Management