Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

Ziyu Liu; Hongwen Zhang; Zhenghao Chen; Zhiyong Wang; Wanli Ouyang

arXiv:2003.14111 · cs.CV · May 20, 2020 · 89 cites

TL;DR

This paper introduces MS-G3D, a graph convolutional approach that disentangles multi-scale spatial aggregation and unifies spatial-temporal dependency modeling, outperforming the previous state of the art on three skeleton-based action recognition benchmarks.

Contribution

It proposes a simple method for disentangling multi-scale graph convolutions and a unified spatial-temporal graph convolution operator, G3D, which uses dense cross-spacetime edges so that information can flow directly between joints in different frames.
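One plausible reading of "dense cross-spacetime edges" can be sketched in a few lines of Python. The sketch assumes that, inside a short window of frames, every joint is linked to itself and to its spatial neighbors in every frame of the window, which amounts to tiling the spatial adjacency matrix into a block matrix. The function name `spacetime_adjacency`, the three-joint toy skeleton, and the window size are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
def spacetime_adjacency(spatial_adj, window):
    """Tile an n x n spatial adjacency (with self-loops) into an
    (n * window) x (n * window) block matrix.

    Node index r stands for joint (r % n) in frame (r // n). Every block
    repeats the spatial adjacency, so a joint is connected to its spatial
    neighborhood in all frames of the window (assumed reading of
    "dense cross-spacetime edges").
    """
    n = len(spatial_adj)
    size = n * window
    return [
        [spatial_adj[r % n][c % n] for c in range(size)]
        for r in range(size)
    ]


# Toy skeleton: a chain of three joints 0 - 1 - 2, with self-loops.
chain = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]

st_adj = spacetime_adjacency(chain, window=2)
for row in st_adj:
    print(row)
# First row: [1, 1, 0, 1, 1, 0]
# Joint 0 in frame 0 reaches joints 0 and 1 in both frame 0 and frame 1.
```

A single graph convolution over such a matrix mixes spatial and temporal neighbors in one step, instead of alternating a spatial-only layer with a temporal-only layer.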

Findings

1. Outperforms previous state-of-the-art on NTU RGB+D 60, NTU RGB+D 120, and Kinetics Skeleton 400 datasets.

2. Effectively models long-range joint relationships and complex spatial-temporal dependencies.

3. Enhances robustness and accuracy of skeleton-based action recognition.

Abstract

Spatial-temporal graphs have been widely used by skeleton-based action recognition algorithms to model human action dynamics. To capture robust movement patterns from these graphs, long-range and multi-scale context aggregation and spatial-temporal dependency modeling are critical aspects of a powerful feature extractor. However, existing methods have limitations in achieving (1) unbiased long-range joint relationship modeling under multi-scale operators and (2) unobstructed cross-spacetime information flow for capturing complex spatial-temporal dependencies. In this work, we present (1) a simple method to disentangle multi-scale graph convolutions and (2) a unified spatial-temporal graph convolutional operator named G3D. The proposed multi-scale aggregation scheme disentangles the importance of nodes in different neighborhoods for effective long-range modeling. The proposed G3D module…
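One way to picture "disentangling the importance of nodes in different neighborhoods" is to give each scale k its own adjacency mask that keeps only the joints exactly k hops away, plus a self-loop, so that distant joints are not overshadowed by nearby ones. The Python below is a minimal sketch under that assumption; the function names `hop_distances` and `k_hop_adjacency`, the exact-distance rule, and the five-joint toy chain are illustrative and are not taken from the authors' code.

```python
from collections import deque


def hop_distances(num_nodes, edges):
    """All-pairs shortest-path hop counts on an undirected graph, via BFS.

    Unreachable pairs are left as None.
    """
    neighbors = [[] for _ in range(num_nodes)]
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    dist = [[None] * num_nodes for _ in range(num_nodes)]
    for src in range(num_nodes):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] is None:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def k_hop_adjacency(dist, k):
    """Binary mask: 1 on the diagonal and where the hop distance is exactly k."""
    n = len(dist)
    return [
        [1 if (i == j or dist[i][j] == k) else 0 for j in range(n)]
        for i in range(n)
    ]


# Toy skeleton: a chain of five joints 0 - 1 - 2 - 3 - 4.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
dist = hop_distances(5, edges)

two_hop = k_hop_adjacency(dist, 2)
print(two_hop[0])  # [1, 0, 1, 0, 0]: joint 0 sees itself and joint 2 only
print(two_hop[2])  # [1, 0, 1, 0, 1]: joint 2 sees itself and joints 0 and 4
```

Because each mask contains only one ring of neighbors, a model can learn a separate weight per scale without the closer rings being counted again at larger scales.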

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Context-Aware Activity Recognition Systems

Methods: G3D