Human Action Recognition Based on Multi-scale Feature Maps from Depth Video Sequences
Chang Li, Qian Huang, Xing Li, Qianhan Wu

TL;DR
This paper introduces a multi-scale feature map framework using Laplacian pyramid depth motion images for improved human action recognition from depth videos, outperforming existing methods.
Contribution
It presents a novel multi-scale motion representation (LP-DMI) and a multi-granularity descriptor (LP-DMI-HOG) for enhanced action recognition accuracy.
Findings
Achieved over 93% accuracy on the MSRAction3D dataset
Outperformed state-of-the-art benchmarks
Demonstrated effectiveness of multi-scale features in depth video analysis
Abstract
Human action recognition is an active research area in computer vision. Although great progress has been made, previous methods mostly recognize actions based on depth data at only one scale, and thus they often neglect multi-scale features that provide additional information for action recognition in practical application scenarios. In this paper, we present a novel framework focusing on multi-scale motion information to recognize human actions from depth video sequences. We propose a multi-scale feature map called Laplacian pyramid depth motion images (LP-DMI). We employ depth motion images (DMI) as the templates to generate the multi-scale static representation of actions. Then, we calculate LP-DMI to enhance multi-scale dynamic information of motions and reduce redundant static information in human bodies. We further extract the multi-granularity descriptor called LP-DMI-HOG to provide…
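The abstract names a Laplacian pyramid as the multi-scale decomposition but this excerpt does not give its details. As a rough illustration only, the sketch below builds a generic Laplacian pyramid from a single 2-D image and reconstructs the image from it. Everything in it is an assumption made for the example and not the authors' implementation: the function names are hypothetical, the input is a toy array standing in for a DMI, and a 2x2 box filter with nearest-neighbour upsampling replaces the Gaussian smoothing a standard pyramid would use.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (box filter, a simplification)."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w)] for r in range(h)]


def upsample(img, h, w):
    """Grow img to h x w by nearest-neighbour replication (clamped at the border)."""
    max_r, max_c = len(img) - 1, len(img[0]) - 1
    return [[img[min(r // 2, max_r)][min(c // 2, max_c)] for c in range(w)]
            for r in range(h)]


def laplacian_pyramid(img, levels):
    """Return [detail_0, ..., detail_{levels-2}, coarse_residual].

    Each detail band is the current image minus the upsampled next-coarser
    image. Assumes the image is large enough to be halved levels-1 times.
    """
    pyramid, current = [], img
    for _ in range(levels - 1):
        smaller = downsample(current)
        blurred = upsample(smaller, len(current), len(current[0]))
        pyramid.append([[a - b for a, b in zip(row_a, row_b)]
                        for row_a, row_b in zip(current, blurred)])
        current = smaller
    pyramid.append(current)  # coarsest level is kept as-is
    return pyramid


def reconstruct(pyramid):
    """Invert laplacian_pyramid by adding detail bands back, coarse to fine."""
    current = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        up = upsample(current, len(band), len(band[0]))
        current = [[a + b for a, b in zip(row_a, row_b)]
                   for row_a, row_b in zip(band, up)]
    return current


# Toy 8x8 array standing in for a depth motion image (hypothetical data).
toy_dmi = [[float((r * c) % 7) for c in range(8)] for r in range(8)]
bands = laplacian_pyramid(toy_dmi, levels=3)
print([(len(b), len(b[0])) for b in bands])
```

In this sketch, a perfectly flat region produces all-zero detail bands, so only the coarsest level carries it; the detail bands respond to local variation at each scale. How the paper maps this onto static versus dynamic information, and how LP-DMI-HOG is computed from the bands, is not covered by this excerpt.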
Taxonomy
Topics: Human Pose and Action Recognition · Video Surveillance and Tracking Methods · AI and Multimedia in Education
Methods: Laplacian Pyramid
