Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision
Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen

TL;DR
This paper introduces a novel framework for skeleton-based action recognition that leverages multi-modality data through an Implicit Knowledge Exchange Module and a relational distillation approach, improving accuracy and efficiency.
Contribution
It proposes an Implicit Knowledge Exchange Module and a relational cross-modality knowledge distillation framework to enhance multi-modality skeleton-based action recognition.
Findings
Improved recognition accuracy with multi-modality data.
Effective knowledge propagation control between modalities.
Enhanced efficiency with the teacher-student distillation framework.
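To make the idea of relational distillation concrete, here is a minimal sketch of a generic relation-matching loss. It is an assumption-laden illustration, not the paper's formulation: the function names `cosine_similarity_matrix` and `relational_distillation_loss` are hypothetical, and the choice of cosine similarity with a mean squared error is one common option among several. The point it shows is that the student is trained to reproduce the pairwise relations between samples in the teacher's embedding space, so teacher and student embeddings may have different dimensions.

```python
import math
from typing import List, Sequence


def cosine_similarity_matrix(embeddings: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise cosine similarity between all samples in a batch."""
    normed = []
    for vec in embeddings:
        # Guard against zero vectors by falling back to a norm of 1.0.
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])
    return [[sum(a * b for a, b in zip(u, v)) for v in normed] for u in normed]


def relational_distillation_loss(
    teacher_emb: Sequence[Sequence[float]],
    student_emb: Sequence[Sequence[float]],
) -> float:
    """Mean squared difference between teacher and student similarity matrices.

    Hypothetical loss for illustration: only the sample-to-sample relations
    are compared, never the raw embeddings themselves.
    """
    t = cosine_similarity_matrix(teacher_emb)
    s = cosine_similarity_matrix(student_emb)
    n = len(t)
    return sum((t[i][j] - s[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)


# Toy batch of two samples: a 2-D teacher space and a 3-D student space.
teacher = [[1.0, 0.0], [0.0, 1.0]]
student_same_relations = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
student_collapsed = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

loss_same = relational_distillation_loss(teacher, student_same_relations)
loss_collapsed = relational_distillation_loss(teacher, student_collapsed)
```

In this toy batch, the first student preserves the teacher's relations (the two samples stay orthogonal) and incurs no penalty, while the collapsed student maps both samples to the same direction and is penalized.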
Abstract
Self-supervised representation learning for human action recognition has developed rapidly in recent years. Most existing works are based on skeleton data and use a multi-modality setup. These works overlook the performance differences among modalities, which leads to the propagation of erroneous knowledge between modalities, and they use only three fundamental modalities (joints, bones, and motions) without exploring additional ones. In this work, we first propose an Implicit Knowledge Exchange Module (IKEM), which alleviates the propagation of erroneous knowledge between low-performance modalities. We then propose three new modalities to enrich the complementary information between modalities. Finally, to maintain efficiency when introducing new modalities, we propose a novel teacher-student framework to distill the knowledge from the secondary…
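The three fundamental modalities named in the abstract are commonly derived from the same joint coordinates. The sketch below follows that common convention and is an assumption, not something the abstract specifies: a bone is the vector from a joint's parent to the joint, and motion is the difference between consecutive frames. The helper names, the toy parent list `PARENTS`, and the zero padding of the last motion frame are all hypothetical choices for illustration.

```python
from typing import List, Sequence, Tuple

Joint = Tuple[float, ...]   # coordinates of one joint
Frame = List[Joint]         # all joints in one frame


def bones_from_joints(frames: Sequence[Frame], parents: Sequence[int]) -> List[Frame]:
    """Bone vector of joint v is joint[v] minus joint[parents[v]].

    The root joint is its own parent, so its bone vector is zero.
    """
    return [
        [
            tuple(c - p for c, p in zip(frame[v], frame[parents[v]]))
            for v in range(len(frame))
        ]
        for frame in frames
    ]


def motion_from_joints(frames: Sequence[Frame]) -> List[Frame]:
    """Motion at frame t is joints[t + 1] minus joints[t].

    The last frame has no successor and is padded with zeros (an assumption).
    """
    out: List[Frame] = []
    for t, frame in enumerate(frames):
        if t + 1 < len(frames):
            nxt = frames[t + 1]
            out.append([tuple(b - a for a, b in zip(j0, j1)) for j0, j1 in zip(frame, nxt)])
        else:
            out.append([tuple(0.0 for _ in joint) for joint in frame])
    return out


# Toy skeleton: a 3-joint chain 0 -> 1 -> 2 in 2-D, observed for two frames.
PARENTS = [0, 0, 1]
frames = [
    [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)],
    [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)],
]

bones = bones_from_joints(frames, PARENTS)
motion = motion_from_joints(frames)
```

Each derived stream keeps the shape of the joint stream, which is what lets the same encoder architecture be reused per modality in typical multi-stream setups.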
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Hand Gesture Recognition Systems
