Cross-Model Cross-Stream Learning for Self-Supervised Human Action Recognition
Mengyuan Liu, Hong Liu, Tianyu Guo

TL;DR
This paper introduces a novel self-supervised learning framework for skeleton-based human action recognition that leverages cross-model and cross-stream strategies to improve feature discrimination and multi-stream information integration.
Contribution
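It proposes the Cross-Model and Cross-Stream (CMCS) framework, which combines Cross-Model Adversarial Learning (CMAL) and Cross-Stream Collaborative Learning (CSCL), and introduces SkeletonBYOL as a baseline for self-supervised skeleton-based action recognition.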
Findings
Outperforms state-of-the-art methods on three datasets.
Demonstrates the effectiveness of cross-model adversarial learning and cross-stream collaborative learning.
Shows improved discriminative feature learning within single streams.
Abstract
Because of their instance-level discriminative ability, contrastive learning methods, including MoCo and SimCLR, have been adapted from image representation learning to self-supervised skeleton-based action recognition. These methods usually use multiple data streams (i.e., joint, motion, and bone) for ensemble learning; however, how to construct a discriminative feature space within a single stream and how to effectively aggregate information from multiple streams remain open problems. To this end, this paper first applies a new contrastive learning method called BYOL to skeleton data and formulates SkeletonBYOL as a simple yet effective baseline for self-supervised skeleton-based action recognition. Inspired by SkeletonBYOL, this paper further presents a Cross-Model and Cross-Stream (CMCS) framework. This framework combines Cross-Model…
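The abstract names three skeleton streams (joint, motion, bone) and a BYOL-based baseline. The sketch below illustrates both ideas under stated assumptions and is not the authors' SkeletonBYOL code. It assumes the common convention that motion is the frame-to-frame difference of joint positions and that bone is the vector from a parent joint to its child. The 5-joint layout `TOY_BONES` and the function names are hypothetical. The loss is the standard BYOL regression objective, 2 - 2 * cosine similarity between the online prediction and the target projection.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton given as (child, parent) index pairs.
# Real datasets define their own joint layout; this one is illustrative only.
TOY_BONES = [(1, 0), (2, 1), (3, 1), (4, 1)]


def motion_stream(joints):
    """Frame-to-frame difference; the last frame is left as zeros.

    joints: array of shape (T, V, C) = (frames, joints, coordinates).
    """
    motion = np.zeros_like(joints)
    motion[:-1] = joints[1:] - joints[:-1]
    return motion


def bone_stream(joints, bones=TOY_BONES):
    """Vector from each parent joint to its child; the root joint stays zero."""
    bone = np.zeros_like(joints)
    for child, parent in bones:
        bone[:, child] = joints[:, child] - joints[:, parent]
    return bone


def byol_loss(online_pred, target_proj, eps=1e-12):
    """BYOL-style loss: 2 - 2 * cosine similarity, averaged over the batch.

    Inputs have shape (batch, dim). In training, target_proj would come from
    a momentum (EMA) copy of the online network and receive no gradient;
    that part is omitted in this sketch.
    """
    p = online_pred / (np.linalg.norm(online_pred, axis=1, keepdims=True) + eps)
    z = target_proj / (np.linalg.norm(target_proj, axis=1, keepdims=True) + eps)
    return float(np.mean(2.0 - 2.0 * np.sum(p * z, axis=1)))
```

Each stream keeps the input shape `(T, V, C)`, so the same encoder architecture could be applied to any of them. The loss is near 0 when the two embeddings point in the same direction and near 4 when they point in opposite directions.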
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Multimodal Machine Learning Applications
Methods: Average Pooling · Batch Normalization · Global Average Pooling · 1x1 Convolution · Normalized Temperature-scaled Cross Entropy Loss · Max Pooling · Residual Connection · Residual Block · InfoNCE
