Exploiting deep residual networks for human action recognition from skeletal data
Huy-Hieu Pham, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio A. Velastin

TL;DR
This paper introduces a novel deep residual network approach that transforms skeletal data into image representations for human action recognition, achieving state-of-the-art results on multiple benchmark datasets.
Contribution
The paper proposes a new ResNet-based architecture that learns from RGB images derived from skeletal data, improving accuracy and efficiency in action recognition tasks.
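As a rough illustration of the residual connection that ResNet-style architectures are built on, the following minimal sketch computes `relu(x + F(x))` for a single block. It is an assumption-laden toy, not the paper's architecture: it uses small fully connected layers in place of convolutions, omits batch normalization, and the names `residual_block`, `linear`, `W1`, and `W2` are hypothetical.

```python
def relu(v):
    """Element-wise rectified linear unit on a list of floats."""
    return [max(0.0, x) for x in v]


def linear(W, v):
    """Matrix-vector product; W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_block(x, W1, W2):
    """Toy residual block: output = relu(x + F(x)), with F = W2 . relu(W1 . x).

    Assumption: fully connected layers stand in for the convolutions
    a real ResNet would use.
    """
    h = relu(linear(W1, x))          # first transformation plus nonlinearity
    f = linear(W2, h)                # second transformation: the residual F(x)
    return relu([a + b for a, b in zip(x, f)])  # skip connection adds x back


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zeros = [[0.0] * 3 for _ in range(3)]
x = [1.0, -2.0, 3.0]
print(residual_block(x, zeros, zeros))        # residual is zero: block reduces to relu(x)
print(residual_block(x, identity, identity))
```

With zero weights the block simply passes `relu(x)` through, which is the property that makes very deep stacks of such blocks easier to optimize.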
Findings
Achieves top performance on MSR Action 3D, KARD, and NTU-RGB+D datasets.
Surpasses previous methods by 3.4%, 0.67%, and 2.5% on MSR Action 3D, KARD, and NTU-RGB+D, respectively.
Requires fewer computational resources than existing approaches.
Abstract
The computer vision community is currently focusing on solving action recognition problems in real videos, which contain thousands of samples with many challenges. In this process, Deep Convolutional Neural Networks (D-CNNs) have played a significant role in advancing the state-of-the-art in various vision-based action recognition systems. Recently, the introduction of residual connections in conjunction with a more traditional CNN model in a single architecture called Residual Network (ResNet) has shown impressive performance and great potential for image recognition tasks. In this paper, we investigate and apply deep ResNets for human action recognition using skeletal data provided by depth sensors. Firstly, the 3D coordinates of the human body joints carried in skeleton sequences are transformed into image-based representations and stored as RGB images. These color images are able to…
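The skeleton-to-image step described in the abstract can be sketched roughly as follows. The abstract does not give the exact encoding, so everything specific here is an assumption: the (x, y, z) coordinates of each joint are mapped to the (R, G, B) channels of one pixel using a global min-max normalization to 0..255, with joints laid out as rows and frames as columns. The function name `skeleton_to_rgb` is hypothetical.

```python
def skeleton_to_rgb(sequence):
    """Map a skeleton sequence to an RGB pixel grid.

    sequence: list of frames; each frame is a list of (x, y, z) joint tuples.
    Returns a grid with one row per joint and one column per frame, where
    each pixel is an (R, G, B) tuple of ints in 0..255.

    Assumptions (not taken from the paper): global min-max normalization,
    and the joints-as-rows / frames-as-columns layout.
    """
    coords = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(coords), max(coords)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant sequence

    def to_byte(value):
        # Rescale a coordinate into the 0..255 range of an 8-bit channel.
        return int(round(255 * (value - lo) / span))

    n_joints = len(sequence[0])
    return [
        [tuple(to_byte(c) for c in frame[j]) for frame in sequence]
        for j in range(n_joints)
    ]


# Two frames, two joints per frame (toy values).
sequence = [
    [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0)],
    [(1.0, 1.0, 0.0), (0.0, 0.0, 0.0)],
]
image = skeleton_to_rgb(sequence)
print(image)
```

The resulting grid could then be resized and fed to an image classifier; the choice of normalization range and pixel layout would need to match whatever the actual model was trained on.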
