Localized Adversarial Training for Increased Accuracy and Robustness in Image Classification
Eitan Rothberg, Tingting Chen, Luo Jie, Hao Ji

TL;DR
This paper introduces localized adversarial training, which adds adversarial examples with imperceptibly altered backgrounds to the training set. On MNIST and CIFAR-10, the resulting classifiers show less accuracy loss on natural images and greater robustness against adversarial inputs.
Contribution
The paper proposes a localized adversarial attack that perturbs only the background of an image, and a training technique (localized adversarial training) that includes these examples in the training set to improve classifier robustness.
Findings
Reduced accuracy loss on natural images.
Increased robustness against adversarial attacks.
Effective on MNIST and CIFAR-10 datasets.
Abstract
Today's state-of-the-art image classifiers fail to correctly classify carefully manipulated adversarial images. In this work, we develop a new, localized adversarial attack that generates adversarial examples by imperceptibly altering the backgrounds of normal images. We first use this attack to highlight the unnecessary sensitivity of neural networks to changes in the background of an image, then use it as part of a new training technique: localized adversarial training. By including locally adversarial images in the training set, we are able to create a classifier that suffers less loss than a non-adversarially trained counterpart model on both natural and adversarial inputs. The evaluation of our localized adversarial training algorithm on MNIST and CIFAR-10 datasets shows decreased accuracy loss on natural images, and increased robustness against adversarial inputs.
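The abstract does not spell out how the localized attack or the training set are built, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes an FGSM-style sign-gradient step restricted by a binary background mask, a linear softmax classifier standing in for the neural network, and a hypothetical intensity threshold for picking out background pixels (plausible for MNIST-like dark backgrounds, not for CIFAR-10). All function names and parameters (`localized_fgsm`, `background_mask_from_threshold`, `augment_batch`, `eps`, `thresh`) are illustrative.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def input_gradient(W, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. the input pixels
    for a linear softmax model (a stand-in for a real network)."""
    p = softmax(x @ W + b)            # shape (n, classes)
    p[np.arange(len(y)), y] -= 1.0    # dL/dlogits = p - onehot(y)
    return p @ W.T                    # shape (n, features)

def background_mask_from_threshold(x, thresh=0.1):
    """ASSUMPTION: pixels at or below `thresh` count as background."""
    return (x <= thresh).astype(x.dtype)

def localized_fgsm(W, b, x, y, background_mask, eps=0.1):
    """Sign-gradient step applied only where background_mask == 1."""
    grad = input_gradient(W, b, x, y)
    step = eps * np.sign(grad) * background_mask
    return np.clip(x + step, 0.0, 1.0)

def augment_batch(W, b, x, y, eps=0.1, thresh=0.1):
    """Return the natural batch followed by its locally adversarial copy."""
    mask = background_mask_from_threshold(x, thresh)
    x_adv = localized_fgsm(W, b, x, y, mask, eps)
    return np.concatenate([x, x_adv]), np.concatenate([y, y])
```

In this sketch, a training loop would call `augment_batch` on each mini-batch and fit the model on the combined natural and locally adversarial images. Foreground pixels are left untouched by construction, which is the property that makes the perturbation "localized".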
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Image Processing Techniques and Applications
