The Staged Knowledge Distillation in Video Classification: Harmonizing Student Progress by a Complementary Weakly Supervised Framework
Chao Wang, Zheng Tang

TL;DR
This paper introduces a weakly supervised, staged knowledge distillation framework for video classification that enhances student model efficiency and accuracy by leveraging substage-based learning, cascade training, and pseudo-label optimization.
Contribution
The paper proposes a staged distillation approach that combines substage correlation with a pseudo-label strategy to address the large capacity gap between teacher and student models in video classification.
Findings
Outperforms existing distillation methods on real and simulated datasets
Improves efficiency and accuracy of student models in video classification
Demonstrates the effectiveness of substage-based learning in knowledge distillation
Abstract
In the context of label-efficient learning on video data, the distillation method and the structural design of the teacher-student architecture have a significant impact on knowledge distillation. However, the relationship between these factors has been overlooked in previous research. To address this gap, we propose a new weakly supervised learning framework for knowledge distillation in video classification that is designed to improve the efficiency and accuracy of the student model. Our approach leverages the concept of substage-based learning to distill knowledge based on the combination of student substages and the correlation of corresponding substages. We also employ the progressive cascade training method to address the accuracy loss caused by the large capacity gap between the teacher and the student. Additionally, we propose a pseudo-label optimization strategy to improve the…
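For readers unfamiliar with the underlying mechanism, the following is a minimal sketch of the generic temperature-softened distillation loss that teacher-student methods typically build on. It is not the staged, substage-based objective described in the abstract; the function names `log_softmax` and `distillation_loss` and the default temperature of 4.0 are illustrative assumptions, not values taken from the paper.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Generic knowledge-distillation loss (hypothetical helper); the paper's
    staged objective is not reproduced here.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher log-probabilities
    log_q = log_softmax(student_logits, temperature)  # student log-probabilities
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    student = [1.0, 1.0, 0.0]
    print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's logits and grows as the softened distributions diverge; a larger capacity gap generally makes this target harder for the student to match, which is the problem the progressive cascade training in the abstract is meant to address.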
Taxonomy
Topics: Human Pose and Action Recognition · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
Methods: Knowledge Distillation
