Relational Representation Distillation

Nikolaos Giakoumoglou; Tania Stathaki

arXiv:2407.12073 · cs.CV · May 14, 2025 · 1 citation


Open Access · 1 repository

TL;DR

This paper introduces a novel knowledge distillation method that preserves relational structures in internal representations, outperforming existing methods by better capturing the relationships between instances.

Contribution

The paper proposes a new objective that preserves the relative relationships between instances. It uses separate temperature parameters for the teacher and student distributions and bridges contrastive learning and KL divergence.

Findings

01. Outperforms existing distillation methods across various tasks.

02. Achieves better alignment with teacher models.

03. Sometimes surpasses teacher network performance.

Abstract

Knowledge distillation involves transferring knowledge from large, cumbersome teacher models to more compact student models. The standard approach minimizes the Kullback-Leibler (KL) divergence between the probabilistic outputs of a teacher and student network. However, this approach fails to capture important structural relationships in the teacher's internal representations. Recent advances have turned to contrastive learning objectives, but these methods impose overly strict constraints through instance-discrimination, forcing apart semantically similar samples even when they should maintain similarity. This motivates an alternative objective by which we preserve relative relationships between instances. Our method employs separate temperature parameters for teacher and student distributions, with sharper student outputs, enabling precise learning of primary relationships while…
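The abstract is truncated above, so the exact loss is not shown here. As a rough illustration only, the idea of matching relative relationships with separate temperatures could be sketched as follows. This sketch assumes each sample is compared with a small set of reference embeddings (the hypothetical `teacher_anchors` and `student_anchors`), that the similarities are turned into distributions with a softmax, and that the student distribution is pulled toward the teacher's with a KL term. All function names and the temperature values are illustrative assumptions, not taken from the paper or its repository.

```python
import math

def softmax(scores, temperature):
    """Temperature-scaled softmax; a lower temperature gives a sharper distribution."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def normalize(vector):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]

def cosine_similarities(embedding, anchors):
    """Cosine similarity between one embedding and each anchor embedding."""
    unit = normalize(embedding)
    return [sum(a * b for a, b in zip(unit, normalize(anchor))) for anchor in anchors]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def relational_loss(teacher_emb, student_emb, teacher_anchors, student_anchors,
                    tau_teacher=0.1, tau_student=0.05):
    """Hypothetical relational objective: KL between similarity distributions.

    tau_student < tau_teacher follows the abstract's "sharper student outputs";
    the numeric values are illustrative assumptions.
    """
    p_teacher = softmax(cosine_similarities(teacher_emb, teacher_anchors), tau_teacher)
    p_student = softmax(cosine_similarities(student_emb, student_anchors), tau_student)
    return kl_divergence(p_teacher, p_student)

# Toy data: teacher and student spaces may have different dimensions,
# because only the similarity distributions over anchors are compared.
teacher_anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
student_anchors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss = relational_loss([0.9, 0.3, 0.1], [0.8, 0.4], teacher_anchors, student_anchors)
print(f"relational loss: {loss:.4f}")
```

Unlike an instance-discrimination loss, this sketch never labels any anchor as a hard negative. The student is only asked to reproduce how similar the teacher considers each anchor, so semantically close samples can stay close.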


Code & Models

Repositories

giakoumoglou/distillers (PyTorch, official)


Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Advanced Graph Neural Networks · Domain Adaptation and Few-Shot Learning

Methods: Knowledge Distillation · InfoNCE · Contrastive Learning · Focus