Adversarial Machine Learning: Attacking and Safeguarding Image Datasets
Koushik Chowdhury

TL;DR
This paper investigates the vulnerabilities of CNNs to adversarial attacks on image datasets and evaluates a retraining defense method to improve robustness, highlighting ongoing challenges in safeguarding models.
Contribution
It introduces an adversarial training approach to enhance CNN robustness against FGSM attacks across multiple image datasets.
Findings
Adversarial training improves model robustness but does not eliminate all vulnerabilities.
Models trained on clean images show significant accuracy drops under attack.
The effectiveness of defenses varies across datasets and attack intensities.
Abstract
This paper examines the vulnerabilities of convolutional neural networks (CNNs) to adversarial attacks and explores a method for safeguarding them. CNNs were implemented on four of the most common image datasets, namely CIFAR-10, ImageNet, MNIST, and Fashion-MNIST, and achieved high baseline accuracy. To assess the robustness of these models, the Fast Gradient Sign Method (FGSM) was used, an attack that lowers a model's accuracy by adding a very small perturbation to the input image. To counter the FGSM attack, a safeguarding approach was applied that retrains the models on both clean and adversarial images to increase their resistance. FGSM is then applied again, this time to the adversarially trained models, to see how much the accuracy of the models has gone down and evaluate…
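As a rough illustration of the attack and the retraining step described in the abstract, the sketch below applies the standard FGSM update, x_adv = x + eps * sign(grad_x loss), and then builds a mixed clean-plus-adversarial training batch. It is not the paper's implementation: a logistic-regression model stands in for the CNN so that the input gradient can be written analytically, the function names (`loss_grad_wrt_input`, `fgsm`, `mixed_batch`) are hypothetical, and clipping to [0, 1] assumes pixel values scaled to that range.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_grad_wrt_input(x, y, w, b):
    """Gradient of binary cross-entropy w.r.t. the input x.

    Stand-in model (an assumption, not the paper's CNN): p = sigmoid(x @ w + b).
    For this model, d(loss)/dx = (p - y) * w for each sample.
    """
    p = sigmoid(x @ w + b)                 # shape (n,)
    return (p - y)[:, None] * w[None, :]   # shape (n, d)

def fgsm(x, y, w, b, eps):
    """FGSM: step each input feature by eps in the direction that raises the loss."""
    grad = loss_grad_wrt_input(x, y, w, b)
    x_adv = x + eps * np.sign(grad)
    # Assumes inputs are scaled to [0, 1]; keep perturbed values in that range.
    return np.clip(x_adv, 0.0, 1.0)

def mixed_batch(x, y, w, b, eps):
    """Concatenate clean inputs with their FGSM counterparts for retraining."""
    x_adv = fgsm(x, y, w, b, eps)
    return np.concatenate([x, x_adv]), np.concatenate([y, y])

# Small demonstration with made-up weights and a single two-feature "image".
w_demo = np.array([2.0, -1.0])
b_demo = 0.0
x_demo = np.array([[0.5, 0.5]])
y_demo = np.array([1.0])

x_adv_demo = fgsm(x_demo, y_demo, w_demo, b_demo, eps=0.1)
print("clean score:      ", sigmoid(x_demo @ w_demo + b_demo))
print("adversarial score:", sigmoid(x_adv_demo @ w_demo + b_demo))
```

With a CNN, the analytic gradient would be replaced by the framework's automatic differentiation of the loss with respect to the input, while the sign step and the mixing of clean and adversarial samples would stay the same in spirit.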
Taxonomy
Topics: Adversarial Robustness in Machine Learning
