Exploiting the Sensitivity of $L_2$ Adversarial Examples to Erase-and-Restore
Fei Zuo, Qiang Zeng

TL;DR
This paper introduces Erase-and-Restore, a novel detection method exploiting the sensitivity of $L_2$ adversarial examples to pixel erasure and inpainting, achieving over 98% detection accuracy on CIFAR-10 and ImageNet.
Contribution
The paper proposes a new adversarial example (AE) detection technique that leverages the sensitivity of $L_2$ AEs to pixel erasure and restoration, and that remains effective against adaptive attacks.
Findings
Detects over 98% of $L_2$ AEs on CIFAR-10 and ImageNet.
Shows high transferability of the detection system across different $L_2$ attack methods.
Demonstrates robustness against adaptive $L_2$ adversarial attacks.
Abstract
By adding carefully crafted perturbations to input images, adversarial examples (AEs) can be generated to mislead neural-network-based image classifiers. $L_2$ adversarial perturbations by Carlini and Wagner (CW) are among the most effective but difficult-to-detect attacks. While many countermeasures against AEs have been proposed, detection of adaptive CW-$L_2$ AEs is still an open question. We find that, by randomly erasing some pixels in an $L_2$ AE and then restoring it with an inpainting technique, the AE, before and after the steps, tends to have different classification results, while a benign sample does not show this symptom. We thus propose a novel AE detection technique, Erase-and-Restore (E&R), that exploits the intriguing sensitivity of $L_2$ attacks. Experiments conducted on two popular image datasets, CIFAR-10 and ImageNet, show that the proposed technique is able to…
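The erase-then-restore check described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the helper names (`erase_pixels`, `inpaint_mean`, `looks_adversarial`), the neighbour-mean fill used in place of a real inpainting method, the erase fraction, and the majority-vote decision rule are all assumptions made for the example, since the summary does not specify them.

```python
import numpy as np

def erase_pixels(image, erase_frac, rng):
    """Return a boolean (H, W) mask marking randomly chosen pixels to erase."""
    h, w = image.shape[:2]
    return rng.random((h, w)) < erase_frac

def inpaint_mean(image, mask):
    """Fill each erased pixel with the mean of its non-erased 3x3 neighbours.

    Assumption: a crude stand-in for a real inpainting technique.
    Pixels with no surviving neighbour are filled with 0.
    """
    img = image.astype(np.float64)
    if img.ndim == 2:
        img = img[..., None]                      # treat grayscale as 1 channel
    keep = (~mask).astype(np.float64)[..., None]  # 1 where the pixel survives
    pad = ((1, 1), (1, 1), (0, 0))
    padded_vals = np.pad(img * keep, pad)
    padded_keep = np.pad(keep, pad)
    h, w = mask.shape
    total = np.zeros_like(img)
    count = np.zeros((h, w, 1))
    for dy in range(3):
        for dx in range(3):
            total += padded_vals[dy:dy + h, dx:dx + w]
            count += padded_keep[dy:dy + h, dx:dx + w]
    fill = total / np.maximum(count, 1.0)
    restored = np.where(mask[..., None], fill, img)
    return restored.reshape(image.shape)

def looks_adversarial(image, classify, erase_frac=0.1, trials=5, seed=0):
    """Flag the input if erase-and-restore changes the label in most trials.

    `classify` is any callable mapping an image array to a label.
    The majority vote is an assumed decision rule for illustration only.
    """
    rng = np.random.default_rng(seed)
    original_label = classify(image)
    flips = 0
    for _ in range(trials):
        mask = erase_pixels(image, erase_frac, rng)
        restored = inpaint_mean(image, mask)
        if classify(restored) != original_label:
            flips += 1
    return flips > trials / 2
```

To try it, pass an image array (for example a CIFAR-10 sample scaled to [0, 1]) and a function wrapping a trained model's arg-max prediction as `classify`. A benign input is expected to keep its label across trials, whereas the paper reports that $L_2$ AEs tend to change label after restoration.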
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research · Anomaly Detection Techniques and Applications
Methods: Autoencoders
