ARDIR: Improving Robustness using Knowledge Distillation of Internal Representation
Tomokatsu Takahashi, Masanori Yamada, Yuuki Yamanaka, Tomoya Yamashita

TL;DR
ARDIR enhances adversarial training by distilling the teacher model's internal representations, in addition to its outputs, into the student model. The resulting student models are more robust to adversarial attacks than those trained with previous methods.
Contribution
The paper introduces ARDIR, a novel method that leverages internal representations in knowledge distillation to improve adversarial robustness.
Findings
ARDIR outperforms previous adversarial training methods.
Using internal representations provides richer training signals.
Student models trained with ARDIR show increased robustness.
Abstract
Adversarial training is the most promising method for learning robust models against adversarial examples. A recent study has shown that knowledge distillation between the same architectures is effective in improving the performance of adversarial training. Exploiting knowledge distillation is a new approach to improve adversarial training and has attracted much attention. However, its performance is still insufficient. Therefore, we propose Adversarial Robust Distillation with Internal Representation (ARDIR) to utilize knowledge distillation even more effectively. In addition to the output of the teacher model, ARDIR uses the internal representation of the teacher model as a label for adversarial training. This enables the student model to be trained with richer, more informative labels. As a result, ARDIR can learn more robust student models. We show that ARDIR outperforms previous…
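The idea of using both the teacher's output and its internal representation as training labels can be illustrated with a small loss sketch. This is not the paper's exact objective, which the abstract does not spell out. It assumes a KL-divergence term on the outputs and a mean-squared-error term on one internal feature vector, and the names `distillation_loss`, `feat_weight`, and `temperature` are hypothetical. Which inputs the teacher sees (clean or adversarial) is also left open here.

```python
import math

def softmax(logits, temperature=1.0):
    # Numerically stable softmax over a list of logits.
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mean_squared_error(a, b):
    # Average squared difference between two feature vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def distillation_loss(student_logits_adv, student_feat_adv,
                      teacher_logits, teacher_feat,
                      feat_weight=1.0, temperature=1.0):
    # Assumed form: output-matching term plus a weighted feature-matching term.
    # The student quantities are computed on an adversarial example;
    # the teacher quantities act as the labels.
    output_term = kl_divergence(
        softmax(teacher_logits, temperature),
        softmax(student_logits_adv, temperature),
    )
    feature_term = mean_squared_error(student_feat_adv, teacher_feat)
    return output_term + feat_weight * feature_term
```

In this sketch the loss is zero only when the student matches the teacher on both the output distribution and the chosen internal feature, which is the sense in which the internal representation adds a second, richer label.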
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
Methods: Knowledge Distillation
