ReL-SAR: Representation Learning for Skeleton Action Recognition with Convolutional Transformers and BYOL
Safwen Naimi, Wassim Bouachir, Guillaume-Alexandre Bilodeau

TL;DR
ReL-SAR is an unsupervised representation learning framework that combines a lightweight convolutional transformer with BYOL to improve skeleton action recognition, especially on limited data, by jointly modeling spatial and temporal cues in skeleton sequences.
Contribution
The paper proposes a novel lightweight convolutional transformer framework with a joint spatial-temporal modeling approach and a selection-permutation strategy, leveraging BYOL for unsupervised skeleton action recognition.
Findings
Achieved competitive results on multiple datasets.
Outperformed state-of-the-art methods in accuracy.
Demonstrated high computational efficiency.
Abstract
To extract robust and generalizable skeleton action recognition features, large amounts of well-curated data are typically required, which is a challenging task hindered by annotation and computation costs. Therefore, unsupervised representation learning is of prime importance to leverage unlabeled skeleton data. In this work, we investigate unsupervised representation learning for skeleton action recognition. For this purpose, we designed a lightweight convolutional transformer framework, named ReL-SAR, exploiting the complementarity of convolutional and attention layers for jointly modeling spatial and temporal cues in skeleton sequences. We also use a Selection-Permutation strategy for skeleton joints to ensure more informative descriptions from skeletal data. Finally, we capitalize on Bootstrap Your Own Latent (BYOL) to learn robust representations from unlabeled skeleton sequence…
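As a rough illustration of the BYOL objective the abstract refers to, the sketch below shows the two ingredients BYOL is generally known for: a normalized regression loss between the online network's prediction and the target network's projection, and an exponential moving average (EMA) update of the target weights. It is a minimal NumPy sketch, not the authors' implementation. The function names (`l2_normalize`, `byol_loss`, `ema_update`) and the default `tau=0.99` are assumptions made for this example, and the ReL-SAR encoder, skeleton augmentations, and Selection-Permutation step are not modeled.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit L2 norm along `axis` (eps avoids division by zero)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def byol_loss(online_pred, target_proj):
    """BYOL-style regression loss: 2 - 2 * cosine similarity, averaged over the batch.

    online_pred: (batch, dim) output of the online predictor for one view.
    target_proj: (batch, dim) output of the target projector for the other view
                 (treated as a constant; no gradient flows through it in BYOL).
    """
    p = l2_normalize(online_pred)
    z = l2_normalize(target_proj)
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=-1)))


def ema_update(target_params, online_params, tau=0.99):
    """Move each target parameter toward its online counterpart.

    tau is the EMA decay; 0.99 is an illustrative value, not taken from the paper.
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]
```

In a full training loop, the loss would typically be symmetrized by swapping the two augmented views, and `ema_update` would be called after each optimizer step on the online network.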
Taxonomy
Topics: Medical Imaging and Analysis · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
Methods: Softmax · Attention Is All You Need
