Generalized Knowledge Distillation via Relationship Matching
Han-Jia Ye, Su Lu, De-Chuan Zhan

TL;DR
This paper introduces REFILLED, a generalized knowledge distillation method that matches relationships between instances. This lets a student learn from a teacher whose classes are the same as, completely different from, or partially overlapping with its own, with benefits reported for standard distillation, incremental learning, and few-shot learning.
Contribution
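The paper proposes REFILLED, a relationship matching approach for generalized knowledge distillation. It decouples the distillation flow of the embedding from that of the top-layer classifier, so it can handle teacher and student class sets that are identical, disjoint, or partially overlapping.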
Findings
Achieves state-of-the-art results in standard distillation tasks.
Demonstrates strong discriminative ability across class overlap scenarios.
Effective in incremental and few-shot learning settings.
Abstract
The knowledge of a well-trained deep neural network (a.k.a. the "teacher") is valuable for learning similar tasks. Knowledge distillation extracts knowledge from the teacher and integrates it with the target model (a.k.a. the "student"), which expands the student's knowledge and improves its learning efficacy. Instead of enforcing the teacher to work on the same task as the student, we borrow the knowledge from a teacher trained from a general label space -- in this "Generalized Knowledge Distillation (GKD)", the classes of the teacher and the student may be the same, completely different, or partially overlapped. We claim that the comparison ability between instances acts as an essential factor threading knowledge across tasks, and propose the RElationship FacIlitated Local cLassifiEr Distillation (REFILLED) approach, which decouples the GKD flow of the embedding and the top-layer…
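As a rough illustration of the idea that "comparison ability between instances" can carry knowledge across label spaces, the sketch below matches instance-to-instance relationships between a teacher embedding and a student embedding. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the squared Euclidean distance, the temperature parameter, and the KL objective are all hypothetical choices made for illustration, since the abstract does not specify them. The loss uses no class labels, so it stays defined whether the teacher and student classes match, differ, or partially overlap, and the two embeddings may have different dimensions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def relation_distribution(embeddings, anchor, temperature=1.0):
    """Distribution over the other instances: closer to the anchor = more mass.

    Assumption: similarity is the negative squared distance, scaled by a
    hypothetical temperature. Other similarity measures would also work.
    """
    scores = [
        -sq_dist(embeddings[anchor], e) / temperature
        for j, e in enumerate(embeddings)
        if j != anchor
    ]
    return softmax(scores)


def relationship_matching_loss(teacher_emb, student_emb, temperature=1.0):
    """Mean KL(teacher || student) between per-anchor relation distributions.

    No class labels are used, so the teacher's label space never has to
    line up with the student's.
    """
    n = len(teacher_emb)
    total = 0.0
    for i in range(n):
        p = relation_distribution(teacher_emb, i, temperature)
        q = relation_distribution(student_emb, i, temperature)
        # Clamp q to avoid log(0) if a student probability underflows.
        total += sum(
            pi * (math.log(pi) - math.log(max(qi, 1e-12)))
            for pi, qi in zip(p, q)
            if pi > 0.0
        )
    return total / n


if __name__ == "__main__":
    # Toy batch of three instances; teacher is 3-d, student is 2-d.
    teacher = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 5.0, 0.0]]
    student_same_geometry = [[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]]
    student_swapped = [[0.0, 0.0], [0.0, 5.0], [1.0, 0.0]]
    print(relationship_matching_loss(teacher, student_same_geometry))
    print(relationship_matching_loss(teacher, student_swapped))
```

In this toy setup the loss is zero when the student reproduces the teacher's pairwise geometry and positive when it does not. In practice a term like this would be minimized with respect to the student's parameters in an automatic differentiation framework, alongside whatever classifier-level objective is used.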
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Text and Document Classification Technologies
Methods: Knowledge Distillation
