UHKD: A Unified Framework for Heterogeneous Knowledge Distillation via Frequency-Domain Representations
Fengming Yu, Haiwei Pan, Kejia Zhang, Jian Guan, Haiying Jiang

TL;DR
UHKD is a knowledge distillation framework for teacher and student models with different architectures. It transfers semantic information from intermediate features in the frequency domain, so that a compressed student keeps competitive accuracy.
Contribution
The paper proposes a unified framework that uses frequency-domain features and alignment modules for heterogeneous knowledge distillation. It targets a limitation of prior heterogeneous methods, which work mainly in the logits space and underuse the semantic information in intermediate layers.
Findings
Achieves up to 5.59% accuracy improvement on CIFAR-100.
Achieves up to 0.83% accuracy improvement on ImageNet-1K.
Effective in transferring semantic knowledge across diverse architectures.
Abstract
Knowledge distillation (KD) is an effective model compression technique that transfers knowledge from a high-performance teacher to a lightweight student, reducing computational and storage costs while maintaining competitive accuracy. However, most existing KD methods are tailored to homogeneous models and perform poorly in heterogeneous settings, particularly when intermediate features are involved. Semantic discrepancies across architectures hinder effective use of the teacher's intermediate representations, and prior heterogeneous KD studies focus mainly on the logits space, underutilizing the rich semantic information in intermediate layers. To address this, the paper proposes Unified Heterogeneous Knowledge Distillation (UHKD), a framework that leverages intermediate features in the frequency domain for cross-architecture transfer. Frequency-domain representations are leveraged…
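The idea of comparing teacher and student intermediate features in the frequency domain can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the log-amplitude spectrum, the average pooling used to reconcile spatial sizes, the linear channel projection `proj` standing in for an alignment module, and the MSE objective are all assumptions chosen for illustration.

```python
import numpy as np


def frequency_features(feat):
    """Log-amplitude 2D spectrum of a (C, H, W) feature map (assumed representation)."""
    spec = np.fft.fft2(feat, axes=(-2, -1))
    return np.log1p(np.abs(spec))


def pool_to(feat, size):
    """Average-pool a (C, H, W) map to (C, size, size); H and W must be divisible by size."""
    c, h, w = feat.shape
    return feat.reshape(c, size, h // size, size, w // size).mean(axis=(2, 4))


def freq_distill_loss(teacher_feat, student_feat, proj, size=4):
    """MSE between teacher and channel-projected student features in the frequency domain.

    proj is a hypothetical (C_teacher, C_student) matrix playing the role of an
    alignment module; in training it would be learned together with the student.
    """
    t = frequency_features(pool_to(teacher_feat, size))
    s = np.einsum("tc,chw->thw", proj, pool_to(student_feat, size))
    s = frequency_features(s)
    return float(np.mean((t - s) ** 2))


rng = np.random.default_rng(0)
teacher = rng.standard_normal((8, 16, 16))   # e.g. a feature map from one architecture
student = rng.standard_normal((4, 8, 8))     # a smaller map from a different architecture
proj = 0.1 * rng.standard_normal((8, 4))     # hypothetical alignment weights
loss = freq_distill_loss(teacher, student, proj)
```

In this sketch the pooling step handles mismatched spatial resolution and the projection handles mismatched channel counts, which are the two shape differences that arise when teacher and student come from different architecture families. The loss would be added to the usual task loss when training the student.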
