CKD: Contrastive Knowledge Distillation from A Sample-wise Perspective
Wencheng Zhu, Xin Zhou, Pengfei Zhu, Yu Wang, Qinghua Hu

TL;DR
This paper introduces CKD, a contrastive knowledge distillation method that aligns teacher-student logits at the sample level, effectively preserving semantic relationships and improving performance across multiple vision tasks.
Contribution
The paper presents a contrastive distillation framework that combines intra-sample logit alignment with inter-sample semantic contrast, reducing computational complexity and the dependence on large batch sizes.
Findings
Improves image classification accuracy on CIFAR-100 and ImageNet-1K.
Enhances object detection and segmentation performance on MS COCO.
Reduces computational complexity compared to traditional contrastive methods.
Abstract
In this paper, we propose a simple yet effective contrastive knowledge distillation framework that achieves sample-wise logit alignment while preserving semantic consistency. Conventional knowledge distillation approaches exhibit over-reliance on feature similarity per sample, which risks overfitting, and contrastive approaches focus on inter-class discrimination at the expense of intra-sample semantic relationships. Our approach transfers "dark knowledge" through teacher-student contrastive alignment at the sample level. Specifically, our method first enforces intra-sample alignment by directly minimizing teacher-student logit discrepancies within individual samples. Then, we utilize inter-sample contrasts to preserve semantic dissimilarities across samples. By redefining positive pairs as aligned teacher-student logits from identical samples and negative pairs as cross-sample logit…
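The pairing described in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes a mean-squared-error form for the intra-sample alignment term and an InfoNCE form for the inter-sample term, with each student logit vector as the anchor, the teacher logits of the same sample as the positive, and the teacher logits of the other samples in the batch as negatives. The function names, the weight `alpha`, and the `temperature` default are hypothetical choices made for illustration.

```python
import math


def _normalize(vec):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def alignment_loss(student_logits, teacher_logits):
    """Intra-sample term (assumed form): mean squared teacher-student logit gap."""
    total, count = 0.0, 0
    for s_vec, t_vec in zip(student_logits, teacher_logits):
        for s, t in zip(s_vec, t_vec):
            total += (s - t) ** 2
            count += 1
    return total / count


def contrastive_loss(student_logits, teacher_logits, temperature=0.1):
    """Inter-sample term (assumed form): InfoNCE over the batch.

    Positive pair: student and teacher logits of the same sample.
    Negative pairs: student logits of one sample vs. teacher logits of the others.
    """
    students = [_normalize(v) for v in student_logits]
    teachers = [_normalize(v) for v in teacher_logits]
    losses = []
    for i, s_vec in enumerate(students):
        # Cosine similarity of student i to every teacher vector in the batch.
        sims = [
            sum(a * b for a, b in zip(s_vec, t_vec)) / temperature
            for t_vec in teachers
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(sims)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in sims))
        losses.append(log_denominator - sims[i])
    return sum(losses) / len(losses)


def ckd_style_loss(student_logits, teacher_logits, alpha=1.0, temperature=0.1):
    """Hypothetical combination: alignment plus alpha-weighted contrast."""
    return alignment_loss(student_logits, teacher_logits) + alpha * contrastive_loss(
        student_logits, teacher_logits, temperature
    )
```

Under these assumptions the number of negatives per anchor is the batch size minus one, and the loss is low when each student vector is closest to its own teacher vector and high when it is closer to another sample's teacher vector.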
Taxonomy
Topics: Artificial Intelligence in Healthcare
Methods: InfoNCE · Knowledge Distillation · Contrastive Learning
