Heterogeneous Skeleton-Based Action Representation Learning
Hongsong Wang, Xiaoyan Ma, Jidong Kuang, Jie Gui

TL;DR
This paper introduces a novel framework for learning unified action representations from heterogeneous skeleton data, addressing variations in joint structures and topologies to improve action recognition across diverse sources.
Contribution
It proposes a two-component method that lifts 2D skeletons to 3D, constructs skeleton-specific prompts, and incorporates semantic motion encoding to process heterogeneous skeletons.
Findings
Effective on NTU-60, NTU-120, and PKU-MMD II datasets
Improves action recognition accuracy across diverse skeleton sources
Applicable to robots with different humanoid structures
Abstract
Skeleton-based human action recognition has received widespread attention in recent years due to its diverse application scenarios. Because human skeletons come from different sources, skeleton data are naturally heterogeneous. Previous works, however, overlook this heterogeneity and construct models tailored only to homogeneous skeletons. This work addresses heterogeneous skeleton-based action representation learning, focusing on skeleton data that vary in joint dimensions and topological structures. The proposed framework comprises two primary components: heterogeneous skeleton processing and unified representation learning. The former first converts two-dimensional skeleton data into three-dimensional skeletons via an auxiliary network, and then constructs a prompted unified skeleton using skeleton-specific…
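The heterogeneous skeleton processing step can be illustrated with a toy sketch. It is not the paper's implementation: the zero-depth lifting stands in for the auxiliary 2D-to-3D network, and filling missing joint slots with a fixed per-source prompt vector is an assumption made only for illustration (in a real model the prompts would presumably be learned). The source names `kinect_25` and `coco_17`, the `PROMPTS` table, and all function names are hypothetical.

```python
from typing import Dict, List

Frame = List[List[float]]  # one frame: a list of joints, each a list of coordinates

# Hypothetical skeleton-specific prompts, one 3D vector per skeleton source.
# Fixed values here; a trained model would learn them (assumption).
PROMPTS: Dict[str, List[float]] = {
    "kinect_25": [0.0, 0.0, 0.0],
    "coco_17": [0.1, 0.1, 0.1],
}


def lift_to_3d(frame: Frame) -> Frame:
    """Stand-in for the auxiliary network: append z = 0.0 to every 2D joint."""
    return [j + [0.0] if len(j) == 2 else list(j) for j in frame]


def to_unified(frame: Frame, max_joints: int, prompt: List[float]) -> Frame:
    """Pad a 3D frame to max_joints, filling empty slots with the prompt vector."""
    if len(frame) > max_joints:
        raise ValueError("frame has more joints than the unified skeleton")
    padded = [list(j) for j in frame]
    padded += [list(prompt) for _ in range(max_joints - len(frame))]
    return padded


def process(frame: Frame, source: str, max_joints: int = 25) -> Frame:
    """Lift to 3D if needed, then map onto the unified joint layout."""
    return to_unified(lift_to_3d(frame), max_joints, PROMPTS[source])


# A 17-joint 2D frame and a 25-joint 3D frame end up with the same shape.
frame_2d = [[0.5, 0.25] for _ in range(17)]
frame_3d = [[1.0, 2.0, 3.0] for _ in range(25)]
unified_a = process(frame_2d, "coco_17")
unified_b = process(frame_3d, "kinect_25")
```

After this step both inputs have 25 joints with three coordinates each, so a single downstream encoder can consume them regardless of their original source.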
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
