USDRL: Unified Skeleton-Based Dense Representation Learning with Multi-Grained Feature Decorrelation
Wanjiang Weng, Hongsong Wang, Junbo Wang, Lei He, Guosen Xie

TL;DR
USDRL is a skeleton-based dense representation learning framework that combines multi-grained feature decorrelation with a dense spatio-temporal encoder, improving performance on action recognition, retrieval, and detection benchmarks.
Contribution
The paper proposes a unified dense representation learning method built on feature decorrelation and a dense spatio-temporal encoder. It avoids the momentum encoder and memory bank that negative-based contrastive learning requires, and it learns the local representations needed for dense prediction tasks.
Findings
Outperforms state-of-the-art methods on the NTU-60, NTU-120, PKU-MMD I, and PKU-MMD II datasets.
Effectively captures fine-grained action features for dense prediction tasks.
Demonstrates significant improvements in action recognition, retrieval, and detection.
Abstract
Contrastive learning has recently achieved great success in skeleton-based representation learning. However, the prevailing methods are predominantly negative-based, requiring an additional momentum encoder and memory bank to obtain negative samples, which makes model training more difficult. Furthermore, these methods primarily concentrate on learning a global representation for recognition and retrieval tasks, while overlooking the rich and detailed local representations that are crucial for dense prediction tasks. To alleviate these issues, we introduce a Unified Skeleton-based Dense Representation Learning framework based on feature decorrelation, called USDRL, which employs feature decorrelation across temporal, spatial, and instance domains in a multi-grained manner, reducing redundancy among the dimensions of the representations and thereby maximizing the information extracted from features.…
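To make the idea of feature decorrelation concrete, the following is a minimal NumPy sketch, not the authors' implementation. It uses a Barlow Twins-style cross-correlation objective as a stand-in for the paper's loss, and it assumes dense features of shape (batch, time, joints, dims) that are pooled in three hypothetical ways to obtain instance-, temporal-, and spatial-level views. The function names, the pooling scheme, and the `off_diag_weight` value are all illustrative assumptions.

```python
import numpy as np


def decorrelation_loss(z_a, z_b, off_diag_weight=5e-3, eps=1e-6):
    """Redundancy-reduction loss between two views (Barlow Twins-style stand-in).

    z_a, z_b: arrays of shape (n_samples, n_dims), two augmented views.
    """
    n = z_a.shape[0]
    # Standardize every feature dimension across the sample axis.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between feature dimensions, shape (n_dims, n_dims).
    c = z_a.T @ z_b / n
    diag = np.diag(c)
    # Diagonal terms pull matching dimensions of the two views together.
    on_diag = np.sum((diag - 1.0) ** 2)
    # Off-diagonal terms penalize correlation between different dimensions.
    off_diag = np.sum(c ** 2) - np.sum(diag ** 2)
    return on_diag + off_diag_weight * off_diag


def multi_grained_loss(f_a, f_b):
    """Sum the loss over three assumed granularities of dense features.

    f_a, f_b: arrays of shape (batch, time, joints, dims). The pooling
    choices below are hypothetical, chosen only for illustration.
    """
    b, t, j, d = f_a.shape
    # Instance level: pool over time and joints, one vector per sequence.
    instance = decorrelation_loss(f_a.mean(axis=(1, 2)), f_b.mean(axis=(1, 2)))
    # Temporal level: pool over joints, one vector per frame.
    temporal = decorrelation_loss(
        f_a.mean(axis=2).reshape(b * t, d), f_b.mean(axis=2).reshape(b * t, d)
    )
    # Spatial level: pool over time, one vector per joint.
    spatial = decorrelation_loss(
        f_a.mean(axis=1).reshape(b * j, d), f_b.mean(axis=1).reshape(b * j, d)
    )
    return instance + temporal + spatial


rng = np.random.default_rng(0)
independent = rng.normal(size=(512, 8))
redundant = np.repeat(independent[:, :1], 8, axis=1)  # every dimension identical
loss_independent = decorrelation_loss(independent, independent)
loss_redundant = decorrelation_loss(redundant, redundant)
```

In this sketch, features whose dimensions duplicate one another receive a higher loss than features with independent dimensions, which is the redundancy-reduction behaviour the abstract describes. No negative samples, momentum encoder, or memory bank are involved.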
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques · Human Pose and Action Recognition
