SkeleTR: Towards Skeleton-based Action Recognition in the Wild
Haodong Duan, Mingze Xu, Bing Shuai, Davide Modolo, Zhuowen Tu, Joseph Tighe, Alessandro Bergamo

TL;DR
SkeleTR introduces a two-stage skeleton-based action recognition framework that effectively models intra-person dynamics and inter-person interactions, achieving state-of-the-art results in diverse scenarios.
Contribution
It proposes a novel two-stage approach combining graph convolutions and Transformer encoders for flexible, multi-task skeleton-based action recognition in unconstrained environments.
Findings
Achieves state-of-the-art performance on multiple benchmarks.
Effectively handles variable numbers of people and interactions.
Improves transfer learning and joint training across tasks.
Abstract
We present SkeleTR, a new framework for skeleton-based action recognition. In contrast to prior work, which focuses mainly on controlled environments, we target more general scenarios that typically involve a variable number of people and various forms of interaction between people. SkeleTR works with a two-stage paradigm. It first models the intra-person skeleton dynamics for each skeleton sequence with graph convolutions, and then uses stacked Transformer encoders to capture person interactions that are important for action recognition in general scenarios. To mitigate the negative impact of inaccurate skeleton associations, SkeleTR takes relatively short skeleton sequences as input and increases the number of sequences. As a unified solution, SkeleTR can be directly applied to multiple skeleton-based action tasks, including video-level action classification, instance-level action…
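The two-stage flow described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the layer sizes, the chain-shaped joint graph, the single attention head, the mean pooling, and every function name here (split_into_clips, graph_conv_embed, self_attention, skeletr_like_forward) are assumptions chosen only to show how per-person tracks could be cut into short sequences, embedded one by one with a graph convolution, and then mixed across sequences with self-attention.

```python
import numpy as np

rng = np.random.default_rng(0)


def chain_adjacency(num_joints):
    """Row-normalized adjacency of a toy chain skeleton with self-loops (assumed graph)."""
    a = np.eye(num_joints)
    for i in range(num_joints - 1):
        a[i, i + 1] = a[i + 1, i] = 1.0
    return a / a.sum(axis=1, keepdims=True)


def split_into_clips(track, clip_len):
    """Cut one (T, V, C) skeleton track into non-overlapping (clip_len, V, C) clips."""
    n = track.shape[0] // clip_len
    return [track[i * clip_len:(i + 1) * clip_len] for i in range(n)]


def graph_conv_embed(clip, adj, weight):
    """Stage 1 (illustrative): one graph convolution over joints, then mean pooling.

    clip: (T, V, C) joint coordinates, adj: (V, V), weight: (C, D).
    Returns a single D-dimensional token for the clip.
    """
    h = np.einsum("uv,tvc->tuc", adj, clip) @ weight  # aggregate neighbors, project
    h = np.maximum(h, 0.0)                             # ReLU
    return h.mean(axis=(0, 1))                         # pool over time and joints


def self_attention(tokens, wq, wk, wv):
    """Stage 2 (illustrative): single-head self-attention across all clip tokens."""
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    scores = q @ k.T / np.sqrt(q.shape[-1])            # (N, N) pairwise scores
    scores = scores - scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)     # softmax over tokens
    return tokens + attn @ v                           # residual connection


def skeletr_like_forward(tracks, clip_len=8, dim=16, num_classes=5):
    """Run both stages on a list of per-person tracks of possibly different length."""
    num_joints, channels = tracks[0].shape[1:]
    adj = chain_adjacency(num_joints)
    # Random weights stand in for trained parameters.
    w_gcn = rng.normal(scale=0.1, size=(channels, dim))
    wq, wk, wv = (rng.normal(scale=0.1, size=(dim, dim)) for _ in range(3))
    w_cls = rng.normal(scale=0.1, size=(dim, num_classes))

    clips = [c for t in tracks for c in split_into_clips(t, clip_len)]
    tokens = np.stack([graph_conv_embed(c, adj, w_gcn) for c in clips])  # (N, D)
    tokens = self_attention(tokens, wq, wk, wv)
    video_logits = tokens.mean(axis=0) @ w_cls         # assumed video-level head
    return tokens, video_logits


# Two people tracked for 32 and 16 frames, 17 joints, 3 coordinates each.
tracks = [rng.normal(size=(32, 17, 3)), rng.normal(size=(16, 17, 3))]
tokens, video_logits = skeletr_like_forward(tracks)
print(tokens.shape, video_logits.shape)
```

Because the second stage operates on a set of tokens, the number of people and the number of short sequences per person can vary from video to video without changing the model. In this sketch the video-level prediction pools all tokens, while the per-token outputs remain available for finer-grained predictions.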
Taxonomy
Topics: Human Pose and Action Recognition · Context-Aware Activity Recognition Systems · Anomaly Detection Techniques and Applications
Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Dense Connections · Position-Wise Feed-Forward Layer · Absolute Position Encodings · Residual Connection · Adam · Linear Layer · Dropout
