AGKD-BML: Defense Against Adversarial Attack by Attention Guided Knowledge Distillation and Bi-directional Metric Learning

Hong Wang; Yuefan Deng; Shinjae Yoo; Haibin Ling; Yuewei Lin

arXiv:2108.06017 · cs.CV · August 16, 2021

TL;DR

This paper introduces AGKD-BML, a novel adversarial training method combining attention-guided knowledge distillation and bi-directional metric learning to enhance neural network robustness against adversarial attacks.

Contribution

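It proposes a training framework that transfers attention knowledge from a weight-fixed teacher model, trained on clean data, to a student model trained on adversarial examples, and uses bi-directional metric learning to regularize the feature space and improve adversarial robustness.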

Findings

01 Reported to outperform state-of-the-art adversarial defense methods.

02 Attention guidance from the teacher helps the student model focus on the correct regions during adversarial training.

03 Demonstrates robustness across multiple datasets and attack types.

Abstract

While deep neural networks have shown impressive performance in many tasks, they are fragile to carefully designed adversarial attacks. We propose a novel adversarial training-based model by Attention Guided Knowledge Distillation and Bi-directional Metric Learning (AGKD-BML). The attention knowledge is obtained from a weight-fixed model trained on a clean dataset, referred to as a teacher model, and transferred to a model that is under training on adversarial examples (AEs), referred to as a student model. In this way, the student model is able to focus on the correct region, as well as correcting the intermediate features corrupted by AEs to eventually improve the model accuracy. Moreover, to efficiently regularize the representation in feature space, we propose a bidirectional metric learning. Specifically, given a clean image, it is first attacked to its most confusing class to get…
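The attention-transfer idea in the abstract can be illustrated with a small sketch. The abstract does not say how the attention maps are computed, so the code below swaps in a simple activation-based attention (sum of squared channel activations, L2-normalized) as a stand-in; it is not the paper's formulation. The function names `attention_map` and `attention_distillation_loss` and the toy feature values are hypothetical, chosen only for illustration.

```python
import math


def attention_map(features):
    """Collapse a [C][H][W] feature map into a flat, L2-normalized attention vector.

    Assumption: attention is the per-location sum of squared channel
    activations. This is a stand-in, not the paper's attention definition.
    """
    channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    flat = [
        sum(features[c][i][j] ** 2 for c in range(channels))
        for i in range(height)
        for j in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat


def attention_distillation_loss(teacher_features, student_features):
    """Mean squared difference between teacher and student attention vectors."""
    teacher_att = attention_map(teacher_features)
    student_att = attention_map(student_features)
    return sum((t - s) ** 2 for t, s in zip(teacher_att, student_att)) / len(teacher_att)


# Hypothetical toy features: 2 channels over a 2x2 spatial grid.
# The teacher (clean input) attends to the bottom-right location.
teacher_clean = [[[0.0, 0.0], [0.0, 2.0]], [[0.0, 0.0], [0.0, 1.0]]]
# The student (adversarial input) has been pulled toward the top-left location.
student_adv = [[[1.5, 0.0], [0.0, 0.5]], [[1.0, 0.0], [0.0, 0.2]]]

loss_mismatch = attention_distillation_loss(teacher_clean, student_adv)
loss_match = attention_distillation_loss(teacher_clean, teacher_clean)
print(loss_mismatch, loss_match)
```

In a training loop, a term like this would be added to the student's adversarial training loss while the teacher's weights stay fixed, so that only the student is pushed toward the teacher's attention.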

Code & Models

Repositories

hongw579/agkd-bml (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Knowledge Distillation · Autoencoders · Triplet Loss