Aligning in a Compact Space: Contrastive Knowledge Distillation between Heterogeneous Architectures
Hongjun Wu, Li Xiao, Xingkuo Zhang, Yining Miao

TL;DR
This paper introduces LFCC, a contrastive knowledge distillation method that aligns the low-frequency components of intermediate features across heterogeneous neural network architectures, improving student performance on image classification benchmarks.
Contribution
The paper proposes a low-frequency component-based contrastive distillation framework that effectively bridges architectural differences in neural networks, enhancing feature alignment and model performance.
Findings
LFCC outperforms existing methods on ImageNet-1K and CIFAR-100.
It effectively aligns features across CNNs, Transformers, and MLPs.
The approach improves student model accuracy with heterogeneous teachers.
Abstract
Knowledge distillation is commonly employed to compress neural networks, reducing inference cost and memory footprint. When the teacher and student share a homogeneous architecture, feature-based methods have been widely validated as effective. However, when the teacher and student models have heterogeneous architectures, the inherent differences in feature representation significantly degrade the performance of these methods. Recent studies have highlighted that low-frequency components constitute the majority of image features. Motivated by this, we propose a Low-Frequency Components-based Contrastive Knowledge Distillation (LFCC) framework that significantly enhances the performance of feature-based distillation between heterogeneous architectures. Specifically, we design a set of multi-scale low-pass filters to extract the low-frequency components of intermediate…
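The abstract is cut off before it describes the filters or the loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes block average pooling as the low-pass filter, an InfoNCE-style objective as the contrastive loss, and teacher and student feature maps that have already been brought to the same shape. The names `low_pass`, `embed`, `info_nce`, the scales, and the temperature `tau` are all hypothetical.

```python
import numpy as np

def low_pass(feats, k):
    """Average-pool (N, C, H, W) features over k x k blocks.

    Block averaging is used here as a simple stand-in for a low-pass
    filter (an assumption; the paper's filters are not shown in the abstract).
    """
    n, c, h, w = feats.shape
    h2, w2 = h // k, w // k
    cropped = feats[:, :, :h2 * k, :w2 * k]  # drop remainder rows/columns
    return cropped.reshape(n, c, h2, k, w2, k).mean(axis=(3, 5))

def embed(feats, scales=(2, 4)):
    """Flatten the low-passed maps at each scale, concatenate, L2-normalise."""
    parts = [low_pass(feats, k).reshape(len(feats), -1) for k in scales]
    z = np.concatenate(parts, axis=1)
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-8)

def info_nce(student, teacher, tau=0.1):
    """InfoNCE-style loss: row i of `student` should match row i of `teacher`
    and differ from every other teacher row in the batch."""
    logits = student @ teacher.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    teacher_feats = rng.normal(size=(8, 2, 8, 8))
    # Hypothetical student: a slightly perturbed copy of the teacher features.
    student_feats = teacher_feats + 0.01 * rng.normal(size=teacher_feats.shape)
    z_t, z_s = embed(teacher_feats), embed(student_feats)
    print("aligned loss:   ", info_nce(z_s, z_t))
    print("misaligned loss:", info_nce(z_s, np.roll(z_t, 1, axis=0)))
```

In this toy setup the loss should be lower when each student embedding is paired with its own teacher embedding than when the teacher batch is shifted by one position. In a real heterogeneous setting (for example a CNN student and a Transformer teacher) the feature shapes differ, so some projection into a shared space would be needed before a comparison like this one.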
Taxonomy
Topics: Architecture and Computational Design · Architecture and Art History Studies
Methods: Sparse Evolutionary Training · Contrastive Learning · Knowledge Distillation
