TL;DR
This paper introduces a novel spatial-temporal joint density (STJD) measure for self-supervised skeleton-based action recognition. The measure identifies discriminative joints and aligns their representations with that of the full skeleton sequence, improving performance on multiple datasets.
Contribution
It proposes the spatial-temporal joint density (STJD) and two new learning strategies, STJD-CL and STJD-MP, to better utilize joint interactions for action classification.
Findings
Achieved 3.5-3.6% performance improvements over state-of-the-art methods.
Demonstrated effectiveness across NTU RGB+D 60, NTU RGB+D 120, and PKUMMD datasets.
Enhanced recognition accuracy by focusing on prime joints identified through STJD.
Abstract
Traditional approaches in unsupervised or self-supervised learning for skeleton-based action classification have concentrated predominantly on the dynamic aspects of skeletal sequences. Yet, the intricate interaction between the moving and static elements of the skeleton presents a rarely tapped discriminative potential for action classification. This paper introduces a novel measurement, referred to as spatial-temporal joint density (STJD), to quantify such interaction. Tracking the evolution of this density throughout an action can effectively identify a subset of discriminative moving and/or static joints, termed "prime joints", to steer self-supervised learning. A new contrastive learning strategy named STJD-CL is proposed to align the representation of a skeleton sequence with that of its prime joints while simultaneously contrasting the representations of prime and nonprime joints.…
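The alignment-and-contrast idea behind STJD-CL can be illustrated with a minimal sketch. This is not the paper's loss: it assumes a generic InfoNCE-style objective over three already-computed embeddings, and the function name `prime_joint_contrastive_loss`, the `temperature` parameter, and the use of cosine similarity are all hypothetical choices made for illustration. How STJD is computed and how prime joints are selected is not specified in the excerpt above, so the sketch takes the embeddings as given.

```python
import math


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two equal-length vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def prime_joint_contrastive_loss(seq_emb, prime_emb, nonprime_emb, temperature=0.1):
    """Hypothetical InfoNCE-style loss, NOT the paper's exact formulation.

    Anchor:   embedding of the prime joints.
    Positive: embedding of the full skeleton sequence (pulled toward the anchor).
    Negative: embedding of the nonprime joints (pushed away from the anchor).
    """
    pos = cosine(prime_emb, seq_emb) / temperature
    neg = cosine(prime_emb, nonprime_emb) / temperature
    # -log(exp(pos) / (exp(pos) + exp(neg))) equals softplus(neg - pos);
    # the form below evaluates it without overflow.
    d = neg - pos
    return max(d, 0.0) + math.log1p(math.exp(-abs(d)))


# Toy 2-D embeddings: the loss is low when the sequence matches the prime
# joints and the nonprime joints point elsewhere, and high in the reverse case.
aligned = prime_joint_contrastive_loss([1.0, 0.0], [1.0, 0.0], [-1.0, 0.0])
misaligned = prime_joint_contrastive_loss([-1.0, 0.0], [1.0, 0.0], [1.0, 0.0])
print(aligned < misaligned)
```

In a real pipeline the three embeddings would come from a skeleton encoder applied to the full sequence and to the prime and nonprime joint subsets, and the loss would be averaged over a batch.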
Taxonomy
Methods: Contrastive Learning · ALIGN
