Simple Unsupervised Knowledge Distillation With Space Similarity

Aditya Singh; Haohan Wang

arXiv:2409.13939·cs.AI·September 24, 2024

TL;DR

This paper introduces a simple unsupervised knowledge distillation method that encourages a student network to model the teacher's embedding manifold using space similarity, improving preservation of the teacher's latent space.

Contribution

It proposes a novel space similarity loss that captures the teacher's embedding manifold more effectively than prior methods relying solely on normalized features.

Findings

01. Outperforms existing UKD methods on multiple benchmarks.

02. Effectively preserves the teacher's latent manifold.

03. Enhances student network performance without labeled data.

Abstract

Recent studies show that self-supervised learning (SSL) does not readily extend to smaller architectures. One way to mitigate this shortcoming while still training a smaller network without labels is unsupervised knowledge distillation (UKD). Existing UKD approaches handcraft preservation-worthy inter/intra-sample relationships between the teacher and its student. However, this may overlook other key relationships present in the teacher's mapping. In this paper, instead of heuristically constructing preservation-worthy relationships between samples, we directly motivate the student to model the teacher's embedding manifold. If the mapped manifold is similar, all inter/intra-sample relationships are indirectly conserved. We first demonstrate that prior methods cannot preserve the teacher's latent manifold due to their sole reliance on $L_{2}$ normalised…
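To make the idea concrete, here is a minimal sketch of one possible reading of "space similarity". It is an illustration under stated assumptions, not the authors' implementation: the summary above does not spell out the loss, so the function names (`feature_similarity`, `space_similarity`, `distillation_loss`), the weight `lam`, and the choice of cosine similarity over embedding dimensions are all hypothetical. The sketch compares teacher and student embeddings row-wise (each sample, L2-normalised) and column-wise (each embedding dimension across the batch).

```python
import numpy as np


def _cosine_rows(a, b, eps=1e-8):
    """Cosine similarity between matching rows of a and b."""
    a_n = a / (np.linalg.norm(a, axis=1, keepdims=True) + eps)
    b_n = b / (np.linalg.norm(b, axis=1, keepdims=True) + eps)
    return np.sum(a_n * b_n, axis=1)


def feature_similarity(student, teacher):
    """Mean cosine similarity per sample (rows are L2-normalised)."""
    return float(np.mean(_cosine_rows(student, teacher)))


def space_similarity(student, teacher):
    """Assumed form: mean cosine similarity per embedding dimension,
    i.e. the same comparison applied to the transposed batch."""
    return float(np.mean(_cosine_rows(student.T, teacher.T)))


def distillation_loss(student, teacher, lam=1.0):
    """Hypothetical combined objective; lam is an illustrative weight."""
    return -(feature_similarity(student, teacher)
             + lam * space_similarity(student, teacher))


# Toy batch: 8 samples, 4-dimensional embeddings.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4))

# A "student" that matches every teacher direction but rescales each sample.
scales = rng.uniform(0.5, 2.0, size=(8, 1))
student = teacher * scales

print("feature similarity:", feature_similarity(student, teacher))
print("space similarity:  ", space_similarity(student, teacher))
```

In this toy batch the per-sample term cannot tell the rescaled student from the teacher, because L2 normalisation discards each sample's norm, while the per-dimension term does register the mismatch. That is the kind of gap a manifold-level objective is meant to close, under the assumptions stated above.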

Taxonomy

Topics: Neural Networks and Applications

Methods: Knowledge Distillation