Decoupled Spatial-Temporal Attention Network for Skeleton-Based Action Recognition
Lei Shi, Yifan Zhang, Jian Cheng, Hanqing Lu

TL;DR
This paper introduces DSTA-Net, an attention-based model for skeleton-based action recognition that models spatial-temporal dependencies between joints without relying on hand-crafted graph structures, achieving state-of-the-art results on four datasets.
Contribution
The work proposes a decoupled spatial-temporal attention network with new techniques for attention decoupling and data decoupling, improving generalization and performance in action recognition.
Findings
Achieves state-of-the-art performance on four challenging datasets.
Effectively models spatial-temporal dependencies without graph structures.
Demonstrates robustness across multiple skeleton-based action recognition tasks.
Abstract
Dynamic skeletal data, represented as the 2D/3D coordinates of human joints, has been widely studied for human action recognition due to its high-level semantic information and environmental robustness. However, previous methods rely heavily on hand-crafted traversal rules or graph topologies to draw dependencies between the joints, which limits their performance and generalizability. In this work, we present a novel decoupled spatial-temporal attention network (DSTA-Net) for skeleton-based action recognition. It consists solely of attention blocks, allowing it to model spatial-temporal dependencies between joints without knowing their positions or mutual connections. Specifically, to meet the requirements of skeletal data, three techniques are proposed for building attention blocks, namely, spatial-temporal attention decoupling, decoupled…
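To make the idea of spatial-temporal attention decoupling concrete, the following is a minimal sketch, not the authors' implementation. It assumes identity query/key/value projections (a real model would learn them), a single head, and a spatial-then-temporal ordering; the function names `self_attention`, `spatial_attention`, `temporal_attention`, and `decoupled_st_attention` are hypothetical. Note that no adjacency matrix or joint ordering is used anywhere, which is the property the abstract emphasizes.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    """Scaled dot-product self-attention over a list of feature vectors.

    Assumption: Q, K and V are the raw tokens (identity projections).
    """
    dim = len(tokens[0])
    out = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in tokens]
        weights = softmax(scores)
        # Each output vector is a weighted average of all tokens.
        out.append([sum(w * tok[c] for w, tok in zip(weights, tokens))
                    for c in range(dim)])
    return out

def spatial_attention(seq):
    """Attend across joints within each frame; seq[t][n] is a feature vector."""
    return [self_attention(frame) for frame in seq]

def temporal_attention(seq):
    """Attend across frames, separately for each joint."""
    num_frames, num_joints = len(seq), len(seq[0])
    per_joint = [self_attention([seq[t][n] for t in range(num_frames)])
                 for n in range(num_joints)]
    # Transpose back to the [frame][joint] layout.
    return [[per_joint[n][t] for n in range(num_joints)]
            for t in range(num_frames)]

def decoupled_st_attention(seq):
    """Spatial attention followed by temporal attention (ordering assumed)."""
    return temporal_attention(spatial_attention(seq))

# Toy input: 3 frames, 2 joints, 3D coordinates per joint.
skeleton = [
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]],
    [[0.1, 0.0, 0.0], [1.0, 0.2, 0.0]],
    [[0.2, 0.0, 0.0], [1.0, 0.4, 0.0]],
]
features = decoupled_st_attention(skeleton)
```

Splitting attention this way means each step compares only the joints of one frame or the frames of one joint, rather than every joint in every frame against every other, which keeps the attention maps small. The output keeps the input's frame-by-joint layout.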
Taxonomy
Topics: Human Pose and Action Recognition · Hand Gesture Recognition Systems · Gait Recognition and Analysis
