Investigation of Different Skeleton Features for CNN-based 3D Action Recognition
Zewei Ding, Pichao Wang, Philip O. Ogunbona, Wanqing Li

TL;DR
This paper examines how well various skeleton features, encoded as images, work for CNN-based 3D action recognition, and reports state-of-the-art results on the NTU RGB+D dataset.
Contribution
It introduces a novel encoding of five spatial skeleton features into images and studies the impact of different joint selections for improved CNN-based recognition.
Findings
Achieved 75.32% accuracy in a large-scale challenge
State-of-the-art performance on the NTU RGB+D dataset
Identified effective skeleton features for CNN models
Abstract
Deep learning techniques are being used in skeleton-based action recognition tasks and outstanding performance has been reported. Compared with RNN-based methods, which tend to overemphasize temporal information, CNN-based approaches can jointly capture spatio-temporal information from texture color images encoded from skeleton sequences. There are several skeleton-based features that have proven effective in RNN-based and handcrafted-feature-based methods. However, it remains unknown whether they are suitable for CNN-based approaches. This paper proposes to encode five spatial skeleton features into images with different encoding methods. In addition, the performance implication of different joints used for feature extraction is studied. The proposed method achieved state-of-the-art performance on the NTU RGB+D dataset for 3D human action analysis. An accuracy of 75.32% was achieved in…
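To make the idea of encoding a skeleton sequence as an image concrete, here is a minimal sketch. It is not the paper's method: this page does not name the five features or the encoding schemes, so the sketch assumes one plausible spatial feature (Euclidean distance between every pair of joints) and a simple min-max mapping to grayscale instead of the color encoding the abstract mentions. The function names `pairwise_distances` and `encode_sequence` are hypothetical.

```python
import math
from itertools import combinations


def pairwise_distances(frame):
    """Euclidean distance for every unordered joint pair in one frame.

    `frame` is a sequence of (x, y, z) joint coordinates.
    """
    return [math.dist(a, b) for a, b in combinations(frame, 2)]


def encode_sequence(frames):
    """Map a skeleton sequence to a grayscale image (a list of rows).

    Rows index joint pairs, columns index frames, and pixel values are
    the distances min-max normalized to the range 0..255.
    This normalization is an assumption made for illustration.
    """
    columns = [pairwise_distances(f) for f in frames]
    lo = min(min(c) for c in columns)
    hi = max(max(c) for c in columns)
    # Guard against a constant sequence, where hi == lo.
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    n_pairs = len(columns[0])
    return [
        [round((col[p] - lo) * scale) for col in columns]
        for p in range(n_pairs)
    ]


# Toy sequence: 2 frames with 3 joints each (real skeletons have more).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)],
]
image = encode_sequence(frames)
```

The resulting array has one row per joint pair and one column per frame, so the spatial layout of the feature and its evolution over time both end up in a single 2D input that a standard image CNN could consume. Restricting `frame` to a subset of joints before calling `encode_sequence` would be one way to experiment with the joint selection the abstract refers to.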
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Gait Recognition and Analysis
