Discriminative and Consistent Representation Distillation

Nikolaos Giakoumoglou; Tania Stathaki

arXiv:2407.11802 · cs.CV · May 14, 2025

TL;DR

This paper introduces Discriminative and Consistent Distillation (DCD), a knowledge distillation method that combines a contrastive loss with consistency regularization. It reports state-of-the-art results on CIFAR-100 and ImageNet, and representations that generalize well to other datasets.

Contribution

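DCD pairs a contrastive loss, whose temperature and bias are learned during training rather than fixed as hyperparameters, with a consistency regularizer that reduces the discrepancy between the distributions of teacher and student representations.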

Findings

01. DCD outperforms existing methods on CIFAR-100 and ImageNet.

02. Student models sometimes surpass teacher accuracy.

03. DCD's representations generalize well to other datasets.

Abstract

Knowledge Distillation (KD) aims to transfer knowledge from a large teacher model to a smaller student model. While contrastive learning has shown promise in self-supervised learning by creating discriminative representations, its application in knowledge distillation remains limited and focuses primarily on discrimination, neglecting the structural relationships captured by the teacher model. To address this limitation, we propose Discriminative and Consistent Distillation (DCD), which employs a contrastive loss along with a consistency regularization to minimize the discrepancy between the distributions of teacher and student representations. Our method introduces learnable temperature and bias parameters that adapt during training to balance these complementary objectives, replacing the fixed hyperparameters commonly used in contrastive learning approaches. Through extensive…
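To make the two objectives in the abstract concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation. The specific choices are assumptions made for illustration: an InfoNCE-style contrastive term over a batch, a KL divergence between in-batch similarity distributions as the consistency term, and the hypothetical names `contrastive_loss`, `consistency_loss`, `dcd_style_loss`, `alpha`, and `beta`. Temperature and bias are plain floats here; in training they would be learnable parameters.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length (zero vectors are left unchanged).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def similarity_logits(queries, keys, temperature=1.0, bias=0.0):
    # Pairwise dot products, scaled by the temperature and shifted by the bias.
    return [[dot(q, k) / temperature + bias for k in keys] for q in queries]

def contrastive_loss(student, teacher, temperature, bias):
    # Assumed InfoNCE form: student i should match teacher i and not the
    # other teachers in the batch. A scalar bias shared by all logits cancels
    # inside the softmax; it is kept only to mirror the parameters named above.
    logits = similarity_logits(student, teacher, temperature, bias)
    return -sum(log_softmax(row)[i] for i, row in enumerate(logits)) / len(logits)

def consistency_loss(student, teacher, temperature):
    # Assumed form: KL(teacher || student) between each sample's distribution
    # of similarities to the rest of the batch, averaged over the batch.
    s_logp = [log_softmax(r) for r in similarity_logits(student, student, temperature)]
    t_logp = [log_softmax(r) for r in similarity_logits(teacher, teacher, temperature)]
    total = 0.0
    for s_row, t_row in zip(s_logp, t_logp):
        total += sum(math.exp(t) * (t - s) for s, t in zip(s_row, t_row))
    return total / len(student)

def dcd_style_loss(student, teacher, temperature=0.1, bias=0.0, alpha=1.0, beta=1.0):
    # Weighted sum of both terms on L2-normalized embeddings.
    # alpha and beta are hypothetical weights, not values from the paper.
    student = [l2_normalize(v) for v in student]
    teacher = [l2_normalize(v) for v in teacher]
    return (alpha * contrastive_loss(student, teacher, temperature, bias)
            + beta * consistency_loss(student, teacher, temperature))

# Toy batch: three 2-D teacher embeddings and a slightly perturbed student.
teacher = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
student = [[0.9, 0.1], [0.1, 0.9], [-0.8, 0.2]]
loss = dcd_style_loss(student, teacher)
```

In a PyTorch training loop, the same structure would typically hold `temperature` and `bias` as `torch.nn.Parameter` values so the optimizer updates them alongside the student weights.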

Code & Models

Repositories

giakoumoglou/distillers (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications

Methods: Knowledge Distillation · Causal Inference · Contrastive Learning