Feature-map-level Online Adversarial Knowledge Distillation
Inseop Chung, SeongUk Park, Jangho Kim, Nojun Kwak

TL;DR
This paper introduces an online adversarial knowledge distillation method that transfers feature-map information between simultaneously trained networks using discriminators and a cyclic learning scheme, improving performance especially for small networks.
Contribution
It proposes a novel adversarial training framework for online knowledge distillation that incorporates feature map knowledge transfer and a cyclic learning scheme for multiple networks.
Findings
Outperforms traditional L1-based feature alignment methods.
Significantly improves small network performance when paired with larger networks.
Effective across various network architectures on classification tasks.
Abstract
Feature maps contain rich information about image intensity and spatial correlation. However, previous online knowledge distillation methods only utilize the class probabilities. Thus in this paper, we propose an online knowledge distillation method that transfers not only the knowledge of the class probabilities but also that of the feature map using the adversarial training framework. We train multiple networks simultaneously by employing discriminators to distinguish the feature map distributions of different networks. Each network has its corresponding discriminator which discriminates the feature map from its own as fake while classifying that of the other network as real. By training a network to fool the corresponding discriminator, it can learn the other network's feature map distribution. We show that our method performs better than the conventional direct alignment method such…
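The discriminator roles described in the abstract can be illustrated with a small sketch. It assumes a standard binary cross-entropy GAN objective, which the abstract does not spell out, so the exact loss form here is an assumption rather than the authors' formulation. The names `bce`, `discriminator_loss`, and `network_adv_loss` are hypothetical, and `d_own` / `d_other` stand for the discriminator's output probabilities on the network's own feature map and on the other network's feature map.

```python
import math


def bce(p, label, eps=1e-12):
    """Binary cross-entropy of a single probability p against a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def discriminator_loss(d_own, d_other):
    """Loss for the discriminator attached to one network.

    Following the abstract: the network's own feature map is labeled
    fake (0) and the other network's feature map is labeled real (1).
    """
    return bce(d_own, 0) + bce(d_other, 1)


def network_adv_loss(d_own):
    """Adversarial loss for the network itself (assumed non-saturating form).

    The network tries to fool its discriminator, i.e. to have its own
    feature map scored as real (1).
    """
    return bce(d_own, 1)


if __name__ == "__main__":
    # Hypothetical discriminator outputs for one training step.
    d_own, d_other = 0.2, 0.9
    print("discriminator loss:", discriminator_loss(d_own, d_other))
    print("network adversarial loss:", network_adv_loss(d_own))
```

In a two-network setup, each network would have its own discriminator with these roles mirrored. Per the abstract, this adversarial term is used in addition to the transfer of class-probability knowledge, not in place of it.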
Taxonomy
Topics: Adversarial Robustness in Machine Learning · COVID-19 diagnosis using AI · Anomaly Detection Techniques and Applications
Methods: Knowledge Distillation
