ReabsNet: Detecting and Revising Adversarial Examples
Jiefeng Chen, Zihang Meng, Changtian Sun, Wei Tang, Yinglun Zhu

TL;DR
ReabsNet augments an existing classification network with a guardian network that detects adversarial examples and revises them, so that classification accuracy stays high under various attacks.
Contribution
It introduces a guardian network to detect adversarial samples and a revision process to recover their true labels, improving robustness over existing methods.
Findings
Outperforms the state-of-the-art defense method under various adversarial attacks
Effectively detects adversarial examples with high accuracy
Successfully revises adversarial samples to correct labels
Abstract
Though deep neural networks have achieved great success in recent studies and applications, they remain vulnerable to adversarial perturbations that are imperceptible to humans. To address this problem, we propose a novel network called ReabsNet to achieve high classification accuracy in the face of various attacks. The approach is to augment an existing classification network with a guardian network that detects whether a sample is natural or has been adversarially perturbed. Critically, instead of simply rejecting adversarial examples, we revise them to recover their true labels. We exploit the observation that a sample containing adversarial perturbations may return to its true class after revision. We demonstrate that ReabsNet outperforms the state-of-the-art defense method under various adversarial attacks.
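The detect-then-revise flow described in the abstract can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation: the function `detect_then_revise`, the step cap `max_steps`, and all the toy stand-ins (`toy_classify`, `toy_guardian`, `toy_revise`, the 1-D prototypes and thresholds) are assumptions invented for this example. In particular, the abstract does not specify how revision is performed; here it is replaced by a simple pull toward the nearest class prototype.

```python
from typing import Callable, Tuple


def detect_then_revise(
    x: float,
    classify: Callable[[float], int],
    is_adversarial: Callable[[float], bool],
    revise: Callable[[float], float],
    max_steps: int = 10,
) -> Tuple[int, int]:
    """Classify x, revising it first while the guardian flags it.

    Returns (predicted_label, number_of_revision_steps).
    All four callables are hypothetical placeholders.
    """
    steps = 0
    # Instead of rejecting a flagged sample, keep revising it until the
    # guardian accepts it as natural or the step budget runs out.
    while steps < max_steps and is_adversarial(x):
        x = revise(x)
        steps += 1
    return classify(x), steps


# --- Toy 1-D stand-ins (assumptions for illustration only) ---
PROTOTYPES = (0.0, 1.0)  # class 0 lives near 0.0, class 1 near 1.0


def nearest_prototype(x: float) -> float:
    return min(PROTOTYPES, key=lambda p: abs(x - p))


def toy_classify(x: float) -> int:
    # Deliberately miscalibrated boundary so a perturbed sample can fool it.
    return 1 if x >= 0.4 else 0


def toy_guardian(x: float) -> bool:
    # Flags samples that sit far from every class prototype.
    return abs(x - nearest_prototype(x)) > 0.25


def toy_revise(x: float) -> float:
    # Moves the sample halfway toward its nearest prototype.
    return x + 0.5 * (nearest_prototype(x) - x)


# A class-0 sample pushed to 0.45 is misclassified as 1 without the guardian,
# but is flagged, revised once, and then classified as 0.
adversarial_result = detect_then_revise(0.45, toy_classify, toy_guardian, toy_revise)
# A natural class-1 sample passes through unrevised.
natural_result = detect_then_revise(0.9, toy_classify, toy_guardian, toy_revise)
```

In this toy setting the revised sample happens to return to its original class; as the abstract notes, revision only gives a sample a possibility of returning to its true class, so a real system would not recover every label.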
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Malware Detection Techniques
