AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning
Hong Wang, Yuefan Deng, Shinjae Yoo, Haibin Ling, Yuewei Lin

TL;DR
This paper introduces AGKD-BML, a novel adversarial training method combining attention-guided knowledge distillation and bi-directional metric learning to enhance neural network robustness against adversarial attacks.
Contribution
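The paper proposes an adversarial training framework that transfers attention knowledge from a weight-fixed teacher model, trained on clean data, to a student model trained on adversarial examples, and uses bidirectional metric learning to regularize the feature space and improve adversarial robustness.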
Findings
Outperforms state-of-the-art adversarial defense methods.
Keeps the student model focused on the correct image regions during adversarial training.
Demonstrates robustness across multiple datasets and attack types.
Abstract
While deep neural networks have shown impressive performance in many tasks, they are vulnerable to carefully designed adversarial attacks. We propose a novel adversarial training-based model using Attention Guided Knowledge Distillation and Bi-directional Metric Learning (AGKD-BML). The attention knowledge is obtained from a weight-fixed model trained on a clean dataset, referred to as a teacher model, and transferred to a model that is under training on adversarial examples (AEs), referred to as a student model. In this way, the student model can focus on the correct region and correct the intermediate features corrupted by AEs, which eventually improves model accuracy. Moreover, to efficiently regularize the representation in feature space, we propose bidirectional metric learning. Specifically, given a clean image, it is first attacked to its most confusing class to get…
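The attention transfer described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the attention map here is an activation-based one (channel-wise sum of squared activations, L2-normalized), the loss is a mean squared difference, and the function names `attention_map` and `attention_distillation_loss` are hypothetical. The paper's actual attention mechanism and loss may differ. The sketch also assumes the teacher's features come from the clean image and the student's from its adversarial example.

```python
import math


def attention_map(features):
    """Collapse a C x H x W feature tensor (nested lists) into a flat,
    L2-normalized spatial attention map of length H * W.

    Assumption: attention is the channel-wise sum of squared activations.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    flat = [
        sum(features[c][i][j] ** 2 for c in range(channels))
        for i in range(height)
        for j in range(width)
    ]
    # Guard against an all-zero map to avoid division by zero.
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def attention_distillation_loss(teacher_features, student_features):
    """Mean squared difference between teacher and student attention maps.

    The teacher is treated as fixed: only the student would receive
    gradients from this term in a real training loop.
    """
    teacher_attention = attention_map(teacher_features)
    student_attention = attention_map(student_features)
    squared_diffs = [
        (t - s) ** 2 for t, s in zip(teacher_attention, student_attention)
    ]
    return sum(squared_diffs) / len(squared_diffs)


# Toy 1 x 2 x 2 feature tensors: the teacher attends to the top-left
# location, while the (attacked) student attends to the top-right one.
teacher = [[[1.0, 0.0], [0.0, 0.0]]]
student = [[[0.0, 1.0], [0.0, 0.0]]]
loss = attention_distillation_loss(teacher, student)
```

In a full training loop this term would be added to the usual adversarial training loss with a weighting factor, so that the student is penalized whenever an adversarial example shifts its attention away from the regions the teacher uses on clean data.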
Taxonomy
Topics: Adversarial Robustness in Machine Learning
Methods: Knowledge Distillation · Autoencoders · Triplet Loss
