Human-Centered Prior-Guided and Task-Dependent Multi-Task Representation Learning for Action Recognition Pre-Training
Guanhong Wang, Keyu Lu, Yang Zhou, Zhanhao He, Gaoang Wang

TL;DR
This paper introduces a human-centered, task-dependent multi-task pre-training framework for action recognition that leverages human parsing knowledge and combines knowledge distillation with contrastive learning to improve representation quality and task specificity.
Contribution
It proposes a novel pre-training approach that incorporates human prior knowledge and task-dependent representations to enhance action recognition performance.
Findings
Achieves state-of-the-art results on UCF101 and HMDB51 benchmarks.
Effectively integrates human parsing knowledge into self-supervised learning.
Addresses multi-task conflicts with task-dependent representations.
Abstract
Recently, much progress has been made in self-supervised action recognition. Most existing approaches emphasize the contrastive relations among videos, including appearance and motion consistency. However, two main issues remain for existing pre-training methods: 1) the learned representation is neutral and not informative for a specific task; 2) multi-task learning-based pre-training sometimes leads to sub-optimal solutions because the domains of the different tasks are inconsistent. To address these issues, we propose a novel action recognition pre-training framework that exploits human-centered prior knowledge to generate more informative representations and avoids conflicts between tasks by using task-dependent representations. Specifically, we distill knowledge from a human parsing model to enrich the semantic capability of the representation. In addition, we combine knowledge…
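The abstract is cut off before it gives the training objective, so the following is only a minimal sketch of how a contrastive term and a distillation term are commonly combined, not the authors' formulation. It assumes an InfoNCE contrastive loss over cosine similarities and a temperature-softened KL divergence between teacher (human parsing model) and student outputs. All function names, the temperatures, and the weighting factor `weight` are hypothetical choices made for illustration. The inputs are assumed to come from separate task-specific heads, in the spirit of the task-dependent representations described above.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: low when the anchor is closer to the positive
    than to any of the negatives."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    probs = softmax(sims, temperature)
    return -math.log(probs[0])


def distillation_kl(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(anchor, positive, negatives,
                  teacher_logits, student_logits, weight=1.0):
    """Hypothetical weighted sum of the two terms. The embeddings and the
    student logits are assumed to come from different task-specific heads."""
    contrastive = info_nce(anchor, positive, negatives)
    distill = distillation_kl(teacher_logits, student_logits)
    return contrastive + weight * distill
```

In this sketch the contrastive term only sees the embedding head and the distillation term only sees the parsing head, so the two objectives do not compete for the same output vector. How the paper actually routes and weights its tasks is not recoverable from the truncated abstract.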
Taxonomy
Topics: Human Pose and Action Recognition · Medical Imaging and Analysis · Multimodal Machine Learning Applications
Methods: Knowledge Distillation · Contrastive Learning
