Self-Supervised Skeleton-Based Action Representation Learning: A Benchmark and Beyond

Jiahang Zhang; Lilang Lin; Shuai Yang; Jiaying Liu

arXiv:2406.02978 · cs.CV · December 29, 2025 · 1 citation

TL;DR

This paper provides a comprehensive survey and benchmark of self-supervised learning methods for skeleton-based action recognition, introduces a novel SSL approach that enhances generalization across multiple tasks, and demonstrates its effectiveness through extensive experiments.

Contribution

The paper offers the first systematic review and benchmark of skeleton-based SSL methods, and proposes a new multi-granularity SSL technique that improves generalization to downstream tasks such as recognition, retrieval, detection, and few-shot learning.

Findings

01. Most SSL methods rely on a single paradigm and are evaluated only on recognition.

02. The proposed SSL method significantly improves performance across recognition, retrieval, detection, and few-shot learning.

03. Extensive experiments on large datasets validate the effectiveness of the new approach.

Abstract

Self-supervised learning (SSL), which aims to learn meaningful prior representations from unlabeled data, has been proven effective for skeleton-based action understanding. Different from the image domain, skeleton data possesses sparser spatial structures and diverse representation forms, with the absence of background clues and the additional temporal dimension, presenting new challenges for spatial-temporal motion pretext task design. Recently, many endeavors have been made for skeleton-based SSL, achieving remarkable progress. However, a systematic and thorough review is still lacking. In this paper, we conduct, for the first time, a comprehensive survey on self-supervised skeleton-based action representation learning. Following the taxonomy of context-based, generative learning, and contrastive learning approaches, we make a thorough review and benchmark of existing works and shed…

Code & Models

Repositories

jhang2020/pcm3 (PyTorch, official)

Taxonomy

Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications

Methods: Contrastive Learning