Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks

Zheng Xu; Yen-Chang Hsu; Jiawei Huang

arXiv:1709.00513 · cs.LG · April 18, 2018 · 26 cites

TL;DR

This paper introduces a novel knowledge distillation method using conditional adversarial networks to efficiently train small, fast neural networks with improved accuracy for real-time applications.

Contribution

It proposes an adversarial approach to knowledge transfer that improves the training of shallow, thin networks and is especially effective for small student models.

Findings

01 Improved accuracy of small networks via adversarial knowledge distillation

02 Effective training of small networks with reduced inference time

03 Guidelines for selecting appropriate student network sizes

Abstract

There is increasing interest in accelerating neural networks for real-time applications. We study the student-teacher strategy, in which a small and fast student network is trained with auxiliary information learned from a large and accurate teacher network. We propose to use conditional adversarial networks to learn the loss function that transfers knowledge from teacher to student. The proposed method is particularly effective for relatively small student networks. Moreover, experimental results show the effect of network size when modern networks are used as students. We empirically study the trade-off between inference time and classification accuracy, and provide suggestions on choosing a proper student network.
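
To make the idea of a learned, adversarial transfer loss concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the abstract does not specify the discriminator architecture or the exact objective, so everything below is an assumption made for illustration. The linear `discriminator`, its concatenation-based conditioning on the input `x`, and the function names `discriminator_loss` and `student_adversarial_loss` are all hypothetical. The sketch uses the standard GAN-style objective, in which a discriminator scores how likely a logits vector is to have come from the teacher, and the student is penalized when its logits are recognized as its own.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def discriminator(weights, bias, x, logits):
    """Hypothetical linear discriminator, conditioned on the input x.

    Returns the estimated probability that `logits` came from the teacher.
    `weights` must have length len(x) + len(logits).
    """
    # Assumed form of conditioning: concatenate the input with the logits.
    features = list(x) + list(logits)
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(score)


def discriminator_loss(weights, bias, x, teacher_logits, student_logits, eps=1e-12):
    # The discriminator is rewarded for scoring teacher logits near 1
    # and student logits near 0 (binary cross-entropy on both).
    p_teacher = discriminator(weights, bias, x, teacher_logits)
    p_student = discriminator(weights, bias, x, student_logits)
    return -math.log(p_teacher + eps) - math.log(1.0 - p_student + eps)


def student_adversarial_loss(weights, bias, x, student_logits, eps=1e-12):
    # The student is rewarded when the discriminator mistakes its logits
    # for the teacher's (non-saturating GAN form, an assumption here).
    p_student = discriminator(weights, bias, x, student_logits)
    return -math.log(p_student + eps)
```

In a training loop under these assumptions, the two losses would be minimized in alternation: the discriminator parameters on `discriminator_loss`, and the student parameters on `student_adversarial_loss`, typically alongside an ordinary classification loss. The gradient computation and the networks themselves are left out of this sketch.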

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning