RGB-D Based Action Recognition with Light-weight 3D Convolutional Networks
Haokui Zhang, Ying Li, Peng Wang, Yu Liu, Chunhua Shen

TL;DR
This paper introduces lightweight 3D convolutional neural networks for RGB-D action recognition, reducing model complexity while maintaining or improving accuracy on benchmark datasets.
Contribution
The paper proposes novel lightweight 3D-CNN architectures tailored for RGB-D data, achieving high accuracy with fewer parameters and lower computational costs.
Findings
Achieved 93.2% and 97.6% accuracy on the NTU dataset
Achieved 95.5% accuracy on the N-UCLA dataset
Models outperform or match state-of-the-art methods
Abstract
Unlike RGB-only videos, RGB-D videos include depth data that provide key complementary information to the tristimulus visual data, which could improve action recognition accuracy. However, most existing action recognition models use RGB videos alone, which limits their performance. Additionally, the state-of-the-art action recognition models, namely 3D convolutional neural networks (3D-CNNs), contain a very large number of parameters and are computationally inefficient. In this paper, we propose a series of light-weight 3D architectures for action recognition based on RGB-D data. Compared with conventional 3D-CNN models, the proposed light-weight 3D-CNNs have considerably fewer parameters and lower computation cost, while achieving favorable recognition performance. Experimental results on two public benchmark datasets show that our models can approximate or…
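To give a rough sense of why 3D convolutions are parameter-heavy and how a lighter layer can shrink them, the sketch below counts the parameters of a standard 3D convolution and of a depthwise-separable replacement. The depthwise-separable factorization is an assumption chosen for illustration, since the abstract does not say which light-weight design the authors use. The function names and channel sizes (64 in, 128 out) are hypothetical as well.

```python
def conv3d_params(c_in, c_out, kernel=(3, 3, 3), bias=True):
    """Parameter count of a standard 3D convolution layer."""
    kt, kh, kw = kernel
    weights = c_in * c_out * kt * kh * kw
    return weights + (c_out if bias else 0)


def separable_conv3d_params(c_in, c_out, kernel=(3, 3, 3), bias=True):
    """Parameter count of a depthwise 3D conv followed by a 1x1x1 pointwise conv.

    Illustrative assumption only: not necessarily the design used in the paper.
    """
    kt, kh, kw = kernel
    # Depthwise: one kt x kh x kw filter per input channel.
    depthwise = c_in * kt * kh * kw + (c_in if bias else 0)
    # Pointwise: a 1x1x1 convolution that mixes channels.
    pointwise = c_in * c_out + (c_out if bias else 0)
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for the comparison.
standard = conv3d_params(64, 128)
separable = separable_conv3d_params(64, 128)
print(f"standard 3D conv:  {standard} parameters")
print(f"separable 3D conv: {separable} parameters")
```

For these sizes the standard layer has 221,312 parameters and the separable one 10,112. The gap widens with kernel volume, because the standard layer multiplies the kernel volume by both channel counts, while the separable layer multiplies it by the input channel count only.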
Taxonomy
Topics: Human Pose and Action Recognition · Gait Recognition and Analysis · Video Surveillance and Tracking Methods
