FitNets: Hints for Thin Deep Nets

Adriana Romero; Nicolas Ballas; Samira Ebrahimi Kahou; Antoine Chassang; Carlo Gatta; Yoshua Bengio

arXiv:1412.6550 · cs.LG · March 30, 2015 · ICLR · 2.0k cites

TL;DR

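This paper introduces FitNets, a method for training student networks that are thinner and deeper than their teacher. The student learns from the teacher's intermediate representations (hints) as well as its soft outputs, which improves both the training process and the student's final performance.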

Contribution

The paper extends knowledge distillation by using the teacher's intermediate representations as hints, which makes it possible to train students that are deeper and thinner than the teacher while generalizing better or running faster.
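As background for the distillation objective that the paper builds on, the following is a minimal sketch of a soft-output loss as it is commonly written: a cross-entropy against the true label plus a weighted cross-entropy between temperature-softened teacher and student outputs. The function names, the default `temperature=3.0`, and the weight `lam=1.0` are illustrative assumptions, not values taken from this page.

```python
import math

def softened_softmax(logits, temperature):
    """Softmax over logits divided by a temperature; higher values flatten the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, predicted):
    """Cross-entropy H(target, predicted) between two discrete distributions."""
    return -sum(t * math.log(p) for t, p in zip(target, predicted) if t > 0)

def distillation_loss(student_logits, teacher_logits, label, temperature=3.0, lam=1.0):
    """Hard-label term plus lam times the softened teacher/student term.

    temperature and lam are hypothetical defaults chosen for illustration.
    """
    one_hot = [1.0 if i == label else 0.0 for i in range(len(student_logits))]
    hard_term = cross_entropy(one_hot, softened_softmax(student_logits, 1.0))
    soft_term = cross_entropy(
        softened_softmax(teacher_logits, temperature),
        softened_softmax(student_logits, temperature),
    )
    return hard_term + lam * soft_term

# Toy logits: one student agrees with the teacher, the other does not.
teacher_logits = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]
loss_close = distillation_loss(close_student, teacher_logits, label=0)
loss_far = distillation_loss(far_student, teacher_logits, label=0)
```

In this toy setup the student whose logits resemble the teacher's receives the smaller loss, which is the behavior the soft-output term is meant to encourage.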

Findings

1. Deep student networks outperform larger teachers on CIFAR-10.
2. Intermediate hints improve training of thin deep networks.
3. Deeper students with fewer parameters achieve better accuracy.

Abstract

While depth tends to improve network performances, it also makes gradient-based training more difficult since deeper networks tend to be more non-linear. The recently proposed knowledge distillation approach is aimed at obtaining small and fast-to-execute models, and it has shown that a student network could imitate the soft output of a larger teacher network or ensemble of networks. In this paper, we extend this idea to allow the training of a student that is deeper and thinner than the teacher, using not only the outputs but also the intermediate representations learned by the teacher as hints to improve the training process and final performance of the student. Because the student intermediate hidden layer will generally be smaller than the teacher's intermediate hidden layer, additional parameters are introduced to map the student hidden layer to the prediction of the teacher hidden…
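The hint mechanism described in the abstract can be sketched as a regression problem: a small regressor maps the student's narrower hidden layer up to the width of the teacher's hidden layer, and the squared distance between the two is penalized. The sketch below assumes a plain linear regressor on flat vectors for simplicity; the names `regress` and `hint_loss` and the toy values are hypothetical and not taken from the paper's code.

```python
def regress(student_hidden, weights, bias):
    """Linear regressor mapping the student's hidden vector to the teacher's width.

    weights has one row per teacher unit and one column per student unit.
    """
    return [
        sum(w * s for w, s in zip(row, student_hidden)) + b
        for row, b in zip(weights, bias)
    ]

def hint_loss(teacher_hidden, student_hidden, weights, bias):
    """Half the squared L2 distance between the teacher's hint and the regressor output."""
    predicted = regress(student_hidden, weights, bias)
    return 0.5 * sum((t - p) ** 2 for t, p in zip(teacher_hidden, predicted))

# Toy example: a 2-unit student layer guided by a 3-unit teacher layer.
student_hidden = [1.0, 2.0]
teacher_hidden = [1.0, 0.0, 3.0]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical regressor parameters
bias = [0.0, 0.0, 0.0]

loss = hint_loss(teacher_hidden, student_hidden, weights, bias)
print(loss)  # 2.0
```

In training, a loss of this form would be minimized with respect to both the regressor parameters and the student's weights up to the guided layer, so the extra parameters exist only to bridge the width mismatch between the two networks.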

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition