Revisiting Cross-Architecture Distillation: Adaptive Dual-Teacher Transfer for Lightweight Video Models
Ying Peng, Hongsen Ye, Changxin Huang, Xiping Hu, Jian Chen, Runhao Zeng

TL;DR
This paper introduces a dual-teacher knowledge distillation method that combines heterogeneous ViT and homogeneous CNN teachers to improve lightweight CNN video models, addressing architectural mismatch and enhancing accuracy.
Contribution
It proposes a novel dual-teacher framework with adaptive weighting and residual learning strategies to better transfer knowledge from different teacher architectures to CNN students.
Findings
Outperforms existing distillation methods on multiple benchmarks.
Achieves up to 5.95% accuracy gain on HMDB51.
Demonstrates effectiveness of adaptive teacher fusion and residual learning.
Abstract
Vision Transformers (ViTs) have achieved strong performance in video action recognition, but their high computational cost limits their practicality. Lightweight CNNs are more efficient but suffer from accuracy gaps. Cross-Architecture Knowledge Distillation (CAKD) addresses this by transferring knowledge from ViTs to CNNs, yet existing methods often struggle with architectural mismatch and overlook the value of stronger homogeneous CNN teachers. To tackle these challenges, we propose a Dual-Teacher Knowledge Distillation framework that leverages both a heterogeneous ViT teacher and a homogeneous CNN teacher to collaboratively guide a lightweight CNN student. We introduce two key components: (1) Discrepancy-Aware Teacher Weighting, which dynamically fuses the predictions from ViT and CNN teachers by assigning adaptive weights based on teacher confidence and prediction discrepancy with…
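As a rough illustration of the Discrepancy-Aware Teacher Weighting idea described above, the following is a minimal sketch, not the authors' implementation. The abstract is cut off before it says what the prediction discrepancy is measured against, so the reference distribution is left as a caller-supplied parameter (`reference_probs`). The scoring rule (log confidence minus a KL divergence, normalized with a softmax) and all function names are assumptions made for this example.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def fuse_teachers(vit_logits, cnn_logits, reference_probs):
    """Fuse two teachers' predictions with per-sample adaptive weights.

    Hypothetical scoring rule (an assumption, not taken from the paper):
    each teacher's score is log(confidence) minus its KL discrepancy from
    `reference_probs`; the two scores are softmax-normalized into weights.
    """
    p_vit = softmax(vit_logits)
    p_cnn = softmax(cnn_logits)

    scores = []
    for probs in (p_vit, p_cnn):
        confidence = max(probs)  # teacher confidence: top class probability
        discrepancy = kl_divergence(reference_probs, probs)
        scores.append(math.log(confidence) - discrepancy)

    w_vit, w_cnn = softmax(scores)
    fused = [w_vit * a + w_cnn * b for a, b in zip(p_vit, p_cnn)]
    return fused, (w_vit, w_cnn)


# Usage: a confident ViT teacher that matches the reference gets more weight.
vit_logits = [4.0, 0.0, 0.0]
cnn_logits = [1.0, 0.5, 0.0]
reference = softmax(vit_logits)  # placeholder reference for illustration only
fused_probs, (w_vit, w_cnn) = fuse_teachers(vit_logits, cnn_logits, reference)
```

In a distillation setup, the fused distribution would typically serve as the soft target for the student's loss. Whether the paper fuses probabilities or logits, and how it combines confidence with discrepancy, is not stated in the text available here.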
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Neural Network Applications · Robot Manipulation and Learning
