Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff

Satoshi Suzuki; Shin'ya Yamaguchi; Shoichiro Takeda; Sekitoshi Kanai; Naoki Makishima; Atsushi Ando; Ryo Masumura

arXiv:2308.16454·cs.CV·September 1, 2023

Open Access

TL;DR

This paper introduces ARREST, a novel adversarial training method that combines finetuning, representation-guided distillation, and noisy replay to improve both accuracy and robustness of neural networks against adversarial attacks.

Contribution

ARREST is a new adversarial training approach that effectively mitigates the accuracy-robustness tradeoff by preserving latent representations during finetuning.

Findings

1. ARREST outperforms previous methods in balancing accuracy and robustness.

2. It maintains higher standard accuracy on clean data.

3. It enhances robustness against adversarial examples.

Abstract

This paper addresses the tradeoff between standard accuracy on clean examples and robustness against adversarial examples in deep neural networks (DNNs). Although adversarial training (AT) improves robustness, it degrades the standard accuracy, thus yielding the tradeoff. To mitigate this tradeoff, we propose a novel AT method called ARREST, which comprises three components: (i) adversarial finetuning (AFT), (ii) representation-guided knowledge distillation (RGKD), and (iii) noisy replay (NR). AFT trains a DNN on adversarial examples by initializing its parameters with a DNN that is standardly pretrained on clean examples. RGKD and NR respectively entail a regularization term and an algorithm to preserve latent representations of clean examples during AFT. RGKD penalizes the distance between the representations of the standardly pretrained and AFT DNNs. NR switches input adversarial…
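The RGKD idea described above, a regularization term that penalizes the distance between the latent representations of the standardly pretrained DNN and the DNN being adversarially finetuned, can be illustrated with a small sketch. This is not the authors' implementation. The function names, the weight `lam`, and the choice of mean squared Euclidean distance are all assumptions made for illustration, since the abstract does not state the distance metric or how the term is weighted. Noisy replay is left out because its description is truncated here.

```python
def representation_distance(z_pretrained, z_finetuned):
    """Mean squared Euclidean distance between two batches of latent vectors.

    The metric is an assumption; the abstract only says "distance".
    Each argument is a list of equal-length lists of floats.
    """
    if len(z_pretrained) != len(z_finetuned) or not z_pretrained:
        raise ValueError("batches must be non-empty and the same size")
    total = 0.0
    for u, v in zip(z_pretrained, z_finetuned):
        total += sum((a - b) ** 2 for a, b in zip(u, v))
    return total / len(z_pretrained)


def regularized_loss(adv_loss, z_pretrained, z_finetuned, lam=1.0):
    """Adversarial training loss plus a weighted representation penalty.

    `adv_loss` is the loss on adversarial examples, computed elsewhere.
    `lam` is a hypothetical weight, not a value taken from the paper.
    """
    return adv_loss + lam * representation_distance(z_pretrained, z_finetuned)
```

When the finetuned representations match the pretrained ones exactly, the penalty is zero and only the adversarial loss remains. The further the finetuned representations drift, the larger the added term becomes.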

Taxonomy

Topics: Adversarial Robustness in Machine Learning · COVID-19 diagnosis using AI · Machine Learning and Data Classification

Methods: Knowledge Distillation