DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition

Haodong Duan; Jiaqi Wang; Kai Chen; Dahua Lin

arXiv:2210.05895 · cs.CV · October 13, 2022 · 42 cites

TL;DR

This paper introduces DG-STGCN, a dynamic graph convolutional network that adaptively models spatial and temporal features for skeleton-based action recognition, surpassing previous methods by learning flexible joint relationships.

Contribution

The paper proposes a novel framework with learned affinity matrices for dynamic spatial modeling and group-wise temporal convolutions, improving flexibility and accuracy in skeleton-based action recognition.

Findings

01. Outperforms state-of-the-art on NTURGB+D, Kinetics-Skeleton, BABEL, and Toyota SmartHome datasets.

02. Demonstrates significant accuracy improvements over fixed-structure GCN methods.

03. Validates the effectiveness of dynamic structure learning in capturing complex joint correlations.

Abstract

Graph convolution networks (GCN) have been widely used in skeleton-based action recognition. We note that existing GCN-based approaches primarily rely on prescribed graphical structures (i.e., a manually defined topology of skeleton joints), which limits their flexibility to capture complicated correlations between joints. To move beyond this limitation, we propose a new framework for skeleton-based action recognition, namely Dynamic Group Spatio-Temporal GCN (DG-STGCN). It consists of two modules, DG-GCN and DG-TCN, respectively, for spatial and temporal modeling. In particular, DG-GCN uses learned affinity matrices to capture dynamic graphical structures instead of relying on a prescribed one, while DG-TCN performs group-wise temporal convolutions with varying receptive fields and incorporates a dynamic joint-skeleton fusion module for adaptive multi-level temporal modeling. On a wide…
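
To make the idea of learned affinity matrices concrete, here is a minimal NumPy sketch of group-wise spatial aggregation. It is not the authors' implementation: the function name `group_graph_aggregate`, the tensor layout (frames, joints, channels), the number of groups, and the random initialisation are all assumptions chosen for illustration. The sketch covers only the static, freely parameterised part of the idea and omits any data-dependent refinement the full method may apply.

```python
import numpy as np

def group_graph_aggregate(x, affinity):
    """Mix joint features with one affinity matrix per channel group.

    x:        (T, V, C) array: T frames, V joints, C channels (assumed layout).
    affinity: (K, V, V) array: K matrices that would be trainable parameters
              in a real model; none is tied to a hand-defined skeleton graph.
    Returns an array of shape (T, V, C).
    """
    T, V, C = x.shape
    K = affinity.shape[0]
    if C % K != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    # Split channels into K contiguous groups: (T, V, K, C // K).
    xg = x.reshape(T, V, K, C // K)
    # out[t, u, k, c] = sum_v affinity[k, u, v] * xg[t, v, k, c]
    out = np.einsum("kuv,tvkc->tukc", affinity, xg)
    return out.reshape(T, V, C)

# Hypothetical sizes: 4 frames, 25 joints, 8 channels, 2 groups.
rng = np.random.default_rng(0)
T, V, C, K = 4, 25, 8, 2
x = rng.standard_normal((T, V, C))
A = 0.1 * rng.standard_normal((K, V, V))  # random init stands in for learned values
y = group_graph_aggregate(x, A)
```

Because each group has its own matrix, different channel subsets can attend to different joint relationships. Setting every matrix to the identity reduces the operation to a no-op, which is a convenient sanity check.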

Taxonomy

Topics: Human Pose and Action Recognition · Stroke Rehabilitation and Recovery · Context-Aware Activity Recognition Systems

Methods: Graph Convolutional Network · Convolution