Rectifying Adversarial Examples Using Their Vulnerabilities
Fumiya Morimoto, Ryuto Morita, Satoshi Ono

TL;DR
This paper introduces a method to correct adversarial examples in neural networks by re-attacking them to identify their true labels, improving robustness across different attack types without extensive parameter tuning.
Contribution
The proposed approach rectifies adversarial examples by re-attacking them to estimate their original labels, addressing both white-box and black-box attack challenges without additional training.
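The re-attack idea can be illustrated with a toy sketch. This is not the authors' algorithm: the linear classifier (`W`, `b`), the boundary-stepping attack, and the names `predict`, `untargeted_attack`, and `rectify` are all hypothetical, chosen only to make the intuition concrete. It assumes that a minimally perturbed AE lies just across a decision boundary from its original class, so a second small untargeted attack tends to push it back.

```python
import numpy as np

# Hypothetical 3-class linear classifier on 2-D inputs (illustrative only).
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
b = np.array([0.0, 0.0, -3.0])

def predict(x):
    """Return the index of the largest logit."""
    return int(np.argmax(W @ x + b))

def untargeted_attack(x, step=0.05, max_steps=200):
    """Step toward the runner-up class until the predicted label changes.

    A simple boundary-seeking attack assumed for this sketch; it stands in
    for whatever attack is actually used for re-attacking.
    """
    y = predict(x)
    x_adv = x.copy()
    for _ in range(max_steps):
        if predict(x_adv) != y:
            return x_adv
        logits = W @ x_adv + b
        logits[y] = -np.inf              # exclude the current label
        j = int(np.argmax(logits))       # runner-up class
        direction = W[j] - W[y]          # direction that raises logit j relative to logit y
        x_adv = x_adv + step * direction / np.linalg.norm(direction)
    return x_adv                         # label did not change within max_steps

def rectify(x_adv):
    """Estimate the original label as the label reached by re-attacking."""
    return predict(untargeted_attack(x_adv))

x = np.array([0.3, 0.0])                 # clean input, classified as class 0
x_adv = untargeted_attack(x)             # adversarial example, classified as class 1
print(predict(x), predict(x_adv), rectify(x_adv))
```

In this toy setting the rectified label matches the clean label because the AE sits next to the boundary it was pushed across. With a real network and a stronger or targeted attack, that is not guaranteed.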
Findings
Consistent performance across various attack methods
Outperforms traditional rectification techniques
Effective against targeted and black-box attacks
Abstract
Deep neural network-based classifiers are prone to errors when processing adversarial examples (AEs). AEs are minimally perturbed inputs whose perturbations are imperceptible to humans, and they pose significant risks to security-dependent applications. Hence, extensive research has been undertaken to develop defense mechanisms that mitigate their threats. Most existing methods focus on discriminating AEs based on input sample features; they emphasize AE detection without recovering the category the sample belonged to before the attack. While some tasks require only the rejection of detected AEs, others, such as traffic sign recognition in autonomous driving, necessitate identifying the correct original input category. The objective of this study is to propose a method for rectifying AEs to estimate the correct labels of their original inputs. Our method is based on re-attacking AEs to move them beyond…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Malware Detection Techniques
