Detecting and Recovering Adversarial Examples from Extracting Non-robust and Highly Predictive Adversarial Perturbations
Mingyu Dong, Jiahao Chen, Diqun Yan, Jingxing Gao, Li Dong, and Rangding Wang

TL;DR
This paper introduces a model-free method for detecting and recovering adversarial examples in deep neural networks by extracting high-dimensional perturbations, achieving high detection accuracy without querying the victim model.
Contribution
The proposed approach uniquely extracts high-dimensional adversarial perturbations for detection and recovery, bypassing the need for model queries and enhancing security against adversarial attacks.
Findings
High detection accuracy for adversarial examples.
Ability to identify the specific category of a detected adversarial example (AE).
Effective recovery of AEs to normal examples.
Abstract
Deep neural networks (DNNs) have been shown to be vulnerable to adversarial examples (AEs), which are maliciously designed to fool target models. AEs, which are normal examples (NEs) with an imperceptible adversarial perturbation added, can pose a security threat to DNNs. Although existing AE detection methods achieve high accuracy, they fail to exploit the information in the AEs they detect. Thus, based on high-dimension perturbation extraction, we propose a model-free AE detection method whose whole process is free from querying the victim model. Research shows that DNNs are sensitive to high-dimension features. The adversarial perturbation hidden in an adversarial example is such a high-dimension feature: it is highly predictive and non-robust. DNNs learn more details from high-dimension data than from other data. In our method, the perturbation extractor can extract the…
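The extract, detect, recover idea can be illustrated with a small sketch. This is not the authors' method: the paper uses a trained perturbation extractor, while the code below swaps in a plain box-blur residual as a stand-in, and the names `extract_perturbation`, `detect`, `recover`, and the `threshold` value are all hypothetical. What it does share with the description above is that no step queries a victim classifier.

```python
import numpy as np

def extract_perturbation(x, k=3):
    """Hypothetical stand-in for a learned extractor (assumption):
    the high-frequency residual x - box_blur(x) of a 2-D image."""
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")
    smooth = np.zeros_like(x, dtype=float)
    # Sum the k*k shifted copies of the image, then average (box blur).
    for dy in range(k):
        for dx in range(k):
            smooth += padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    smooth /= k * k
    return x - smooth

def detect(x, threshold=1e-4):
    """Flag the input as adversarial when the extracted residual has
    high mean energy. The threshold is an illustrative assumption."""
    delta = extract_perturbation(x)
    return float(np.mean(delta ** 2)) > threshold

def recover(x):
    """Subtract the extracted residual to approximate the normal example."""
    return np.clip(x - extract_perturbation(x), 0.0, 1.0)

# Toy data: a smooth gradient image plus a small sign-pattern perturbation.
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.2, 0.8, 32), (32, 1))
adv = clean + 0.03 * rng.choice([-1.0, 1.0], size=clean.shape)

print("clean flagged:", detect(clean))
print("adversarial flagged:", detect(adv))
print("error before recovery:", np.mean((adv - clean) ** 2))
print("error after recovery:", np.mean((recover(adv) - clean) ** 2))
```

A fixed filter like this only separates the toy case because the clean image is smooth. On natural images, real texture also shows up in the residual, which is one reason a learned extractor and a trained discriminator would be used instead.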
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
Methods: Autoencoders
