All You Need in Knowledge Distillation Is a Tailored Coordinate System

Junjie Zhou; Ke Zhu; Jianxin Wu

arXiv:2412.09388·cs.CV·February 13, 2025

TL;DR

This paper introduces a teacher-free knowledge distillation method called Tailored Coordinate System (TCS), which captures dark knowledge through a linear subspace, enabling efficient, flexible, and high-accuracy student training across architectures.

Contribution

The paper proposes TCS, a novel teacher-free KD approach that uses a tailored coordinate system to transfer knowledge, reducing training time and memory while improving accuracy.

Findings

01  TCS outperforms state-of-the-art KD methods in accuracy.

02  TCS requires about half the training time and GPU memory of state-of-the-art KD methods.

03  TCS is effective across diverse architectures and tasks.

Abstract

Knowledge Distillation (KD) is essential for transferring dark knowledge from a large teacher to a small student network, so that the student can be much more efficient than the teacher while retaining comparable accuracy. Existing KD methods, however, rely on a large teacher trained specifically for the target task, which is both inflexible and inefficient. In this paper, we argue that an SSL-pretrained model can effectively act as the teacher, and that its dark knowledge can be captured by the coordinate system, or linear subspace, in which its features lie. We then need only one forward pass of the teacher, after which we tailor the coordinate system (TCS) for the student network. Our TCS method is teacher-free, applies to diverse architectures, works well for KD and practical few-shot learning, and allows cross-architecture distillation with a large capacity gap. Experiments show that TCS achieves…
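
To make the idea of a coordinate system for features concrete, here is a minimal sketch of one way a linear subspace could be extracted from features collected in a single forward pass. It assumes PCA as the subspace estimator and uses synthetic data in place of real teacher features; the function names `fit_coordinate_system` and `to_coordinates` are hypothetical and are not taken from the paper, and the sketch does not cover how the coordinate system is tailored for the student.

```python
import numpy as np

def fit_coordinate_system(features, k):
    """Estimate a k-dimensional linear subspace via PCA (an assumption).

    Returns the feature mean and a (dim, k) matrix with orthonormal columns.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    # Rows of vt are principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k].T

def to_coordinates(features, mean, basis):
    """Express features in the coordinates of the fitted subspace."""
    return (features - mean) @ basis

rng = np.random.default_rng(0)

# Synthetic stand-in for features saved from one teacher forward pass:
# 200 samples in 16 dimensions that lie close to a 4-dimensional subspace.
latent = rng.normal(size=(200, 4))
mixing = rng.normal(size=(4, 16))
teacher_feats = latent @ mixing + 0.01 * rng.normal(size=(200, 16))

mean, basis = fit_coordinate_system(teacher_feats, k=4)
coords = to_coordinates(teacher_feats, mean, basis)

# Map back to the original space to check how much the subspace retains.
recon = coords @ basis.T + mean
rel_err = np.linalg.norm(recon - teacher_feats) / np.linalg.norm(teacher_feats)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Once the features are stored, the teacher network is not needed again in this sketch: the mean and basis are all that remain, which is one reading of why a single forward pass can suffice.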

Taxonomy

Topics: Emotional Intelligence and Performance · Education and Critical Thinking Development