Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition

Sijie Yan; Yuanjun Xiong; Dahua Lin

arXiv:1801.07455 · cs.CV · January 26, 2018 · 597 cites

TL;DR

This paper introduces ST-GCN, a novel deep learning model that automatically learns spatial and temporal patterns from skeleton data, significantly improving human action recognition accuracy.

Contribution

The paper presents a new spatial-temporal graph convolutional network that surpasses previous methods by learning patterns directly from data, enhancing expressiveness and generalization.

Findings

01. Achieves state-of-the-art results on the Kinetics and NTU-RGBD datasets.

02. Outperforms traditional hand-crafted and traversal-based methods.

03. Demonstrates strong generalization across large datasets.

Abstract

Dynamics of human body skeletons convey significant information for human action recognition. Conventional approaches for modeling skeletons usually rely on hand-crafted parts or traversal rules, thus resulting in limited expressive power and difficulties of generalization. In this work, we propose a novel model of dynamic skeletons called Spatial-Temporal Graph Convolutional Networks (ST-GCN), which moves beyond the limitations of previous methods by automatically learning both the spatial and temporal patterns from data. This formulation not only leads to greater expressive power but also stronger generalization capability. On two large datasets, Kinetics and NTU-RGBD, it achieves substantial improvements over mainstream methods.
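To make the idea of learning "both the spatial and temporal patterns" more concrete, here is a minimal sketch of one spatial-temporal graph convolution step in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the 5-joint skeleton, the single scalar feature per joint, the fixed `weight` and `kernel` values, and all function names are hypothetical. In a trained model the weights would be learned from data and each joint would carry a multi-channel feature vector.

```python
# Toy spatial-temporal graph convolution (illustrative sketch only).
# Skeleton layout, weights, and function names are assumptions for this example.

def normalized_adjacency(num_joints, bones):
    """Build a row-normalized adjacency matrix with self-loops."""
    adj = [[1.0 if i == j else 0.0 for j in range(num_joints)]
           for i in range(num_joints)]
    for a, b in bones:
        adj[a][b] = 1.0
        adj[b][a] = 1.0
    return [[v / sum(row) for v in row] for row in adj]

def spatial_conv(frame, adj, weight):
    """Within one frame, average each joint's neighborhood and scale it."""
    n = len(frame)
    return [weight * sum(adj[i][j] * frame[j] for j in range(n))
            for i in range(n)]

def temporal_conv(frames, kernel):
    """Convolve each joint's feature along the time axis (no padding)."""
    k = len(kernel)
    num_joints = len(frames[0])
    return [[sum(kernel[d] * frames[t + d][j] for d in range(k))
             for j in range(num_joints)]
            for t in range(len(frames) - k + 1)]

def st_gcn_layer(frames, adj, weight, kernel):
    """Apply the spatial step to every frame, then the temporal step."""
    spatial = [spatial_conv(f, adj, weight) for f in frames]
    return temporal_conv(spatial, kernel)

# Hypothetical skeleton: joint 1 is a hub connected to joints 0, 2, 3, 4.
bones = [(0, 1), (1, 2), (1, 3), (1, 4)]
adj = normalized_adjacency(5, bones)

# Four frames, one scalar feature per joint.
frames = [[1.0] * 5 for _ in range(4)]
out = st_gcn_layer(frames, adj, weight=1.0, kernel=[0.5, 0.5])
print(len(out), len(out[0]))
```

The spatial step mixes information between joints connected by bones in the same frame, and the temporal step mixes each joint's feature across consecutive frames. Stacking several such layers is what would let a network build up motion patterns over larger body regions and longer time spans.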

Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Gait Recognition and Analysis