Learn From the Past: Experience Ensemble Knowledge Distillation

Chaofei Wang; Shaowei Zhang; Shiji Song; Gao Huang

arXiv:2202.12488·cs.CV·February 28, 2022

TL;DR

This paper introduces Experience Ensemble Knowledge Distillation (EEKD), a method that exploits the teacher's training experience by ensembling intermediate models saved during teacher training, improving student performance while saving training cost compared with standard ensemble distillation.

Contribution

The paper proposes a distillation approach that incorporates the teacher's training experience through an ensemble of intermediate models, weighted adaptively by a self-attention module. It outperforms mainstream knowledge distillation methods.

Findings

01. EEKD outperforms mainstream knowledge distillation methods.

02. EEKD surpasses standard ensemble distillation while saving training costs.

03. Strong ensemble teachers do not necessarily produce stronger students.

Abstract

Traditional knowledge distillation transfers the "dark knowledge" of a pre-trained teacher network to a student network and ignores the knowledge in the training process of the teacher, which we call the teacher's experience. However, in realistic educational scenarios, learning experience is often more important than learning results. In this work, we propose a novel knowledge distillation method that integrates the teacher's experience for knowledge transfer, named experience ensemble knowledge distillation (EEKD). We uniformly save a moderate number of intermediate models from the training process of the teacher model, and then integrate the knowledge of these intermediate models with an ensemble technique. A self-attention module is used to adaptively assign weights to different intermediate models in the process of knowledge transfer. Three principles of constructing EEKD on the quality,…
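
The pipeline described in the abstract (uniformly saved teacher checkpoints, a weighted ensemble of their soft predictions, and a distillation loss for the student) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the temperature value, and in particular the dot-product scoring in `attention_weights` are hypothetical stand-ins, since the abstract does not specify how the self-attention module computes its weights.

```python
import math


def uniform_checkpoint_epochs(total_epochs, num_checkpoints):
    """Pick epochs spaced evenly across teacher training, ending at the last epoch."""
    step = total_epochs / num_checkpoints
    return [round(step * (i + 1)) for i in range(num_checkpoints)]


def softmax(values, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(student_probs, teacher_probs_list):
    """ASSUMPTION: score each checkpoint by the dot product between its
    prediction and the student's prediction, then normalize with softmax.
    This is only a placeholder for the paper's self-attention module."""
    scores = [
        sum(s * t for s, t in zip(student_probs, teacher_probs))
        for teacher_probs in teacher_probs_list
    ]
    return softmax(scores)


def ensemble_soft_targets(teacher_logits_list, student_logits, temperature=4.0):
    """Combine the softened predictions of all checkpoints into one target."""
    student_probs = softmax(student_logits, temperature)
    teacher_probs_list = [softmax(l, temperature) for l in teacher_logits_list]
    weights = attention_weights(student_probs, teacher_probs_list)
    num_classes = len(student_logits)
    target = [
        sum(w * probs[c] for w, probs in zip(weights, teacher_probs_list))
        for c in range(num_classes)
    ]
    return target, weights


def kd_loss(target_probs, student_logits, temperature=4.0):
    """KL divergence from the ensemble target to the student's softened output."""
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * math.log(p / q)
        for p, q in zip(target_probs, student_probs)
        if p > 0
    )
```

For example, `uniform_checkpoint_epochs(200, 4)` selects epochs 50, 100, 150, and 200, and `ensemble_soft_targets` returns both the combined soft target and the per-checkpoint weights, which always sum to one. A real training setup would compute these quantities per batch in a tensor library and combine `kd_loss` with the usual cross-entropy on ground-truth labels.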

Taxonomy

Topics: Image Enhancement Techniques · Advanced Neural Network Applications · Neural Networks and Reservoir Computing

Methods: Knowledge Distillation