Soft Adversarial Training Can Retain Natural Accuracy

Abhijith Sharma; Apurva Narayan

arXiv:2206.01904 · cs.LG · June 7, 2022

TL;DR

This paper introduces a 'soft' adversarial training framework that retains natural accuracy while providing robustness against adversarial attacks. The framework is aimed at moderately critical applications.

Contribution

The paper presents a training approach that uses abstract certification to select a subset of inputs for adversarial training, with the aim of gaining robustness without sacrificing natural accuracy.

Findings

01. Retains natural accuracy alongside adversarial robustness

02. Effective for moderately critical applications

03. Outperforms traditional adversarial training in certain settings

Abstract

Adversarial training for neural networks has been in the limelight in recent years. The advancement in neural network architectures over the last decade has led to significant improvement in their performance. It sparked an interest in their deployment for real-time applications. This process initiated the need to understand the vulnerability of these models to adversarial attacks. It is instrumental in designing models that are robust against adversaries. Recent works have proposed novel techniques to counter the adversaries, most often sacrificing natural accuracy. Most suggest training with an adversarial version of the inputs, constantly moving away from the original distribution. The focus of our work is to use abstract certification to extract a subset of inputs for (hence we call it 'soft') adversarial training. We propose a training framework that can retain natural accuracy…
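The idea of using a certificate to pick a subset of inputs for adversarial training can be illustrated with a toy sketch. This is not the authors' method: the abstract is truncated and does not say which subset is selected or which abstract domain is used. The sketch assumes a linear binary classifier (so the interval bound is exact), an L-infinity perturbation budget `eps`, and the hypothetical rule "perturb only the inputs that fail certification, keep certified inputs natural". All function names are illustrative.

```python
# Illustrative sketch only. Assumptions (not from the paper): a linear binary
# classifier, an L-infinity budget eps, and the rule that only inputs which
# fail certification are replaced by adversarial versions.

def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def margin(w, b, x, y):
    """Signed margin y * (w.x + b); positive means correctly classified."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b)

def certified(w, b, x, y, eps):
    """Interval-style certificate for a linear model.

    Over all x' with max_i |x'_i - x_i| <= eps, the smallest margin is
    margin(x) - eps * ||w||_1. The input is certified if that stays positive.
    """
    return margin(w, b, x, y) - eps * sum(abs(wi) for wi in w) > 0

def worst_case_input(w, x, y, eps):
    """Shift each coordinate by eps in the direction that lowers the margin."""
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]

def soft_adversarial_batch(w, b, data, eps):
    """Build a training batch that perturbs only the uncertified inputs.

    Returns a list of (x_train, y, was_perturbed) tuples.
    """
    batch = []
    for x, y in data:
        if certified(w, b, x, y, eps):
            batch.append((list(x), y, False))   # keep the natural input
        else:
            batch.append((worst_case_input(w, x, y, eps), y, True))
    return batch

# Toy data: labels are +1 or -1.
w, b, eps = [1.0, -1.0], 0.0, 0.5
data = [([3.0, 0.0], 1), ([0.5, 0.0], 1), ([0.0, 2.0], -1)]
batch = soft_adversarial_batch(w, b, data, eps)
```

In this toy setup the first and third points lie far enough from the decision boundary to be certified and are left untouched, while the second point is replaced by its worst-case perturbation. Because most of the batch stays on the original distribution, this is one plausible reading of how training on a subset could limit the loss of natural accuracy.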
