Neural Collapse Inspired Knowledge Distillation

Shuoxi Zhang; Zijian Song; Kun He

arXiv:2412.11788·cs.CV·December 17, 2024

TL;DR

This paper introduces a knowledge distillation method inspired by Neural Collapse. The student is trained to inherit the teacher's geometric feature structure, which the authors report improves generalization and yields state-of-the-art results.

Contribution

The paper proposes a new distillation paradigm that incorporates Neural Collapse structure transfer, offering a more effective way to bridge the knowledge gap between teacher and student.

Findings

01. NCKD improves student model accuracy.

02. Transferring the NC structure enhances generalization.

03. NCKD achieves state-of-the-art performance.

Abstract

Existing knowledge distillation (KD) methods have demonstrated their ability to bring student networks to performance on par with their teachers. However, the knowledge gap between teacher and student remains significant and may hinder the effectiveness of the distillation process. In this work, we introduce the structure of Neural Collapse (NC) into the KD framework. NC typically occurs in the final phase of training, resulting in a graceful geometric structure in which the last-layer features form a simplex equiangular tight frame. This phenomenon has been shown to improve the generalization of deep networks. We hypothesize that NC can also alleviate the knowledge gap in distillation, thereby enhancing student performance. This paper begins with an empirical analysis that connects knowledge distillation and neural collapse. Through this analysis, we establish that…
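To make the "simplex equiangular tight frame" mentioned in the abstract concrete, the sketch below builds one from the standard definition used in the neural collapse literature: K class vectors obtained by centering and rescaling the one-hot basis. It is an illustration of the geometry only, not the paper's distillation method, and the names `simplex_etf` and `dot` are hypothetical helpers introduced here.

```python
import math


def simplex_etf(num_classes):
    """Return K unit vectors in R^K forming a simplex ETF (one per class).

    Assumption: this is the textbook construction, not code from the paper.
    """
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    # Vector i is sqrt(K/(K-1)) * (e_i - (1/K) * ones):
    # a one-hot vector with the mean removed, rescaled to unit length.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


num_classes = 4
vectors = simplex_etf(num_classes)

# Every class vector has the same (unit) length.
norms = [math.sqrt(dot(v, v)) for v in vectors]

# Every pair of distinct class vectors has the same cosine similarity.
cosines = [
    dot(vectors[i], vectors[j])
    for i in range(num_classes)
    for j in range(i + 1, num_classes)
]

print("norms:", norms)
print("pairwise cosines:", cosines)
```

Up to floating-point error, every norm is 1 and every pairwise cosine equals -1/(K-1), which is -1/3 for K = 4. That is the largest equal separation K unit vectors can have, which is why the structure is described as maximally separating the classes.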

Taxonomy

Topics: Neural Networks and Applications

Methods: Knowledge Distillation