Skeleton-Based Action Recognition with Spatial Reasoning and Temporal Stack Learning
Chenyang Si, Ya Jing, Wei Wang, Liang Wang, Tieniu Tan

TL;DR
This paper introduces SR-TSL, a skeleton-based action recognition model that combines spatial reasoning with temporal stack learning to capture the spatial structure within each frame and the detailed temporal dynamics across the sequence. The authors report improved accuracy over prior methods.
Contribution
The proposed SR-TSL model integrates a residual graph neural network for spatial structure and skip-clip LSTMs for temporal dynamics, with a new clip-based incremental loss for training.
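To make the idea of a clip-based incremental loss concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the sequence has already been split into M clips, that the network emits one vector of class scores per clip, and that the cross-entropy of clip m is weighted by m/M so that later clips count more. The weighting scheme, the function name `clip_incremental_loss`, and the input shapes are illustrative assumptions; the summary above states only that the loss is clip-based and incremental.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def clip_incremental_loss(clip_logits, label):
    """Weighted sum of per-clip cross-entropy losses (hypothetical form).

    clip_logits: array-like of shape (M, C), one row of class scores per clip.
    label: integer index of the ground-truth class.
    Assumption: clip m (1-based) is weighted by m / M.
    """
    clip_logits = np.asarray(clip_logits, dtype=float)
    num_clips = clip_logits.shape[0]
    probs = softmax(clip_logits)
    # Cross-entropy of each clip's prediction against the same label.
    per_clip_ce = -np.log(probs[:, label])
    # Linearly increasing weights: 1/M, 2/M, ..., 1.
    weights = np.arange(1, num_clips + 1) / num_clips
    return float(np.sum(weights * per_clip_ce))


# Example: three clips, two classes, ground-truth class 0.
loss = clip_incremental_loss([[2.0, 0.5], [3.0, 0.2], [4.0, 0.1]], label=0)
print(loss)
```

Under this assumed weighting, a wrong prediction on the final clip is penalized more heavily than the same mistake on the first clip, which matches the intuition that predictions should become more reliable as more of the sequence has been observed.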
Findings
Achieves superior accuracy on the NTU RGB+D dataset
Outperforms state-of-the-art methods
Effectively models spatial and temporal features
Abstract
Skeleton-based action recognition has made great progress recently, but many problems still remain unsolved. For example, most of the previous methods model the representations of skeleton sequences without abundant spatial structure information and detailed temporal dynamics features. In this paper, we propose a novel model with spatial reasoning and temporal stack learning (SR-TSL) for skeleton-based action recognition, which consists of a spatial reasoning network (SRN) and a temporal stack learning network (TSLN). The SRN can capture the high-level spatial structural information within each frame by a residual graph neural network, while the TSLN can model the detailed temporal dynamics of skeleton sequences by a composition of multiple skip-clip LSTMs. During training, we propose a clip-based incremental loss to optimize the model. We perform extensive experiments on the SYSU 3D…
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Gait Recognition and Analysis
