AdvFoolGen: Creating Persistent Troubles for Deep Classifiers
Yuzhen Ding, Nupur Thakur, Baoxin Li

TL;DR
AdvFoolGen introduces a new black-box attack method that generates realistic adversarial images within the natural image feature space, effectively bypassing current defenses and revealing vulnerabilities in deep neural networks.
Contribution
The paper presents a novel attack, AdvFoolGen, that can fool deep classifiers even when advanced defenses are in place, advancing the understanding of neural network vulnerabilities.
Findings
In the authors' experiments, AdvFoolGen is more robust than the well-established attack algorithms it is compared against.
The attack remains effective against state-of-the-art defenses.
Analysis explains why AdvFoolGen successfully bypasses defenses.
Abstract
Research has shown that deep neural networks are vulnerable to malicious attacks, in which adversarial images are crafted to trick a network into misclassification even though human observers would assign the images entirely different labels. To make deep networks more robust to such attacks, many defense mechanisms have been proposed in the literature, some of which are quite effective at guarding against typical attacks. In this paper, we present a new black-box attack termed AdvFoolGen, which can generate attacking images from the same feature space as that of natural images, so as to keep baffling the network even when state-of-the-art defense mechanisms have been applied. We systematically evaluate our model by comparing it with well-established attack algorithms. Through experiments, we demonstrate the effectiveness and robustness of our attack in the face of state-of-the-art…
