Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff
Satoshi Suzuki, Shin'ya Yamaguchi, Shoichiro Takeda, Sekitoshi Kanai, Naoki Makishima, Atsushi Ando, Ryo Masumura

TL;DR
This paper introduces ARREST, a novel adversarial training method that combines adversarial finetuning, representation-guided knowledge distillation, and noisy replay to improve both the standard accuracy of neural networks on clean examples and their robustness against adversarial attacks.
Contribution
ARREST is a new adversarial training approach that effectively mitigates the accuracy-robustness tradeoff by preserving latent representations during finetuning.
Findings
ARREST outperforms previous methods in balancing accuracy and robustness.
It maintains higher standard accuracy on clean data.
It enhances robustness against adversarial examples.
Abstract
This paper addresses the tradeoff between standard accuracy on clean examples and robustness against adversarial examples in deep neural networks (DNNs). Although adversarial training (AT) improves robustness, it degrades the standard accuracy, thus yielding the tradeoff. To mitigate this tradeoff, we propose a novel AT method called ARREST, which comprises three components: (i) adversarial finetuning (AFT), (ii) representation-guided knowledge distillation (RGKD), and (iii) noisy replay (NR). AFT trains a DNN on adversarial examples by initializing its parameters with a DNN that is standardly pretrained on clean examples. RGKD and NR respectively entail a regularization term and an algorithm to preserve latent representations of clean examples during AFT. RGKD penalizes the distance between the representations of the standardly pretrained and AFT DNNs. NR switches input adversarial…
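The RGKD idea described in the abstract can be illustrated with a minimal sketch. It assumes, without confirmation from the abstract, that the distance is a squared Euclidean distance and that the penalty is added to a cross-entropy loss with a weight `lam`. The function names, the weight, and the choice of distance are hypothetical and are not taken from the paper. Noisy replay is not sketched because its description is truncated above.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def squared_l2(u, v):
    """Squared Euclidean distance between two vectors (assumed metric)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def rgkd_objective(logits_adv, label, rep_finetuned, rep_pretrained, lam=1.0):
    """Hypothetical per-example objective for adversarial finetuning.

    logits_adv     : outputs of the finetuned model on an adversarial example
    rep_finetuned  : latent representation from the model being finetuned
    rep_pretrained : latent representation from the frozen, standardly
                     pretrained model
    lam            : assumed weight of the representation penalty
    """
    classification_loss = cross_entropy(logits_adv, label)
    representation_penalty = squared_l2(rep_finetuned, rep_pretrained)
    return classification_loss + lam * representation_penalty
```

When the two representations coincide, the penalty vanishes and the objective reduces to the plain classification loss. As the finetuned representation drifts away from the pretrained one, the penalty grows, which is the mechanism the abstract credits with preserving latent representations of clean examples.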
Taxonomy
Topics: Adversarial Robustness in Machine Learning · COVID-19 diagnosis using AI · Machine Learning and Data Classification
Methods: Knowledge Distillation
