Harmonizing knowledge Transfer in Neural Network with Unified Distillation

Yaomin Huang, Zaomin Yan, Chaomin Shen, Faming Fang, and Guixu Zhang

arXiv:2409.18565·cs.CV·September 30, 2024

Open Access

TL;DR

This paper proposes a unified knowledge distillation framework that aggregates features from multiple layers to transfer comprehensive semantic knowledge from teacher to student neural networks.

Contribution

It introduces a novel approach that combines feature-based and logits-based distillation by aggregating intermediate features into a unified representation for knowledge transfer.
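For context, the logits-based side of this combination is commonly implemented as a temperature-softened KL divergence between teacher and student predictions. The sketch below shows that standard formulation only; it is not taken from the paper, and the function names and the default temperature of 4.0 are illustrative assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logits_kd_loss(teacher_logits, student_logits, temperature=4.0):
    """KL(teacher || student) on softened class probabilities.

    The temperature ** 2 factor is the usual rescaling that keeps gradient
    magnitudes comparable across temperatures. The default temperature is
    an assumption for illustration, not a value from the paper.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
    return temperature ** 2 * kl
```

The loss is zero when the student reproduces the teacher's logits and grows as the two softened distributions diverge.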

Findings

1. Improved student network performance across various tasks.

2. Effective aggregation of multi-layer features enhances knowledge transfer.

3. Unified distribution constraint ensures coherent knowledge distillation.

Abstract

Knowledge distillation (KD), known for its ability to transfer knowledge from a cumbersome network (teacher) to a lightweight one (student) without altering the architecture, has been garnering increasing attention. Two primary categories emerge within KD methods: feature-based, focusing on intermediate layers' features, and logits-based, targeting the final layer's logits. This paper introduces a novel perspective by leveraging diverse knowledge sources within a unified KD framework. Specifically, we aggregate features from intermediate layers into a comprehensive representation, effectively gathering semantic information from different stages and scales. Subsequently, we predict the distribution parameters from this representation. These steps transform knowledge from the intermediate layers into corresponding distributive forms, thereby allowing for knowledge distillation through a…
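The two steps the abstract describes (aggregating intermediate features, then predicting distribution parameters from the aggregate) might be sketched roughly as follows. Every concrete choice here is an assumption made for illustration, since the abstract does not specify them: global average pooling followed by concatenation as the aggregation, a linear head, a diagonal Gaussian as the distribution family, and a KL divergence as the constraint. All function names and the `head` dictionary layout are hypothetical.

```python
import math


def global_average_pool(feature_map):
    """Reduce a feature map (list of channels, each an H x W nested list)
    to one mean value per channel."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def aggregate(stage_features):
    """Assumed aggregation: pool each stage, then concatenate the results
    so stages with different spatial sizes share one representation."""
    representation = []
    for feature_map in stage_features:
        representation.extend(global_average_pool(feature_map))
    return representation


def linear(x, weights, bias):
    """Plain dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def predict_gaussian(representation, head):
    """Hypothetical head predicting mean and log-variance (Gaussian is an
    assumed family, not one confirmed by the abstract)."""
    mean = linear(representation, head["w_mean"], head["b_mean"])
    log_var = linear(representation, head["w_logvar"], head["b_logvar"])
    return mean, log_var


def gaussian_kl(mean_t, log_var_t, mean_s, log_var_s):
    """KL(teacher || student) between diagonal Gaussians, summed over dims."""
    total = 0.0
    for mt, lt, ms, ls in zip(mean_t, log_var_t, mean_s, log_var_s):
        total += 0.5 * (ls - lt + (math.exp(lt) + (mt - ms) ** 2) / math.exp(ls) - 1.0)
    return total
```

In such a setup the teacher and student would each run `aggregate` and `predict_gaussian` on their own intermediate features, and `gaussian_kl` would be added to the training loss alongside a logits-based term.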

Taxonomy

Topics: Neural Networks and Applications

Methods: Knowledge Distillation