On the reversibility of adversarial attacks
Chau Yi Li, Ricardo Sánchez-Matilla, Ali Shahin Shamsabadi, Riccardo Mazzon, Andrea Cavallaro

TL;DR
This paper explores whether it is possible to reverse adversarial attacks on neural network classifiers by analyzing the predictability of class mappings between original and adversarial images, and proposes an approach to recover original predictions.
Contribution
The paper introduces the concept of reversibility of adversarial attacks, quantifies it, and analyzes the factors that affect it across state-of-the-art attacks and classifiers.
Findings
Reversibility varies across different attacks and classifiers.
The effect of an adversarial attack can be reversed using a prior set of classification results.
Factors influencing reversibility include attack strength and classifier robustness.
Abstract
Adversarial attacks modify images with perturbations that change the prediction of classifiers. These modified images, known as adversarial examples, expose the vulnerabilities of deep neural network classifiers. In this paper, we investigate the predictability of the mapping between the classes predicted for original images and for their corresponding adversarial examples. This predictability relates to the possibility of retrieving the original predictions and hence reversing the induced misclassification. We refer to this property as the reversibility of an adversarial attack, and quantify reversibility as the accuracy in retrieving the original class or the true class of an adversarial example. We present an approach that reverses the effect of an adversarial attack on a classifier using a prior set of classification results. We analyse the reversibility of state-of-the-art…
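The idea of retrieving original predictions from a prior set of classification results can be illustrated with a small lookup-table sketch. This is not the authors' implementation: it assumes the prior set is available as (original class, adversarial class) pairs and that the most frequent original class per adversarial class is used as the retrieved prediction. The function names `fit_reverse_mapping` and `reversibility` and the toy class labels are hypothetical.

```python
from collections import Counter, defaultdict


def fit_reverse_mapping(prior_pairs):
    """Build a lookup: adversarial class -> most frequent original class.

    prior_pairs is an iterable of (original_class, adversarial_class)
    tuples; this input format is an assumption made for illustration.
    """
    counts = defaultdict(Counter)
    for original_cls, adversarial_cls in prior_pairs:
        counts[adversarial_cls][original_cls] += 1
    return {adv: c.most_common(1)[0][0] for adv, c in counts.items()}


def reversibility(mapping, test_pairs):
    """Accuracy in retrieving the original class from the adversarial class."""
    if not test_pairs:
        return 0.0
    hits = sum(1 for orig, adv in test_pairs if mapping.get(adv) == orig)
    return hits / len(test_pairs)


# Toy data (hypothetical labels, not results from the paper).
prior = [("cat", "dog"), ("cat", "dog"), ("fox", "dog"), ("car", "truck")]
test = [("cat", "dog"), ("fox", "dog"), ("car", "truck"), ("bus", "van")]

mapping = fit_reverse_mapping(prior)
score = reversibility(mapping, test)
print(mapping)
print(score)
```

In this sketch, an adversarial class that never appears in the prior set (here "van") cannot be reversed and counts as a miss, so the score depends on how well the prior set covers the attack's class mapping.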
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · COVID-19 diagnosis using AI
