Semantic Preserving Adversarial Attack Generation with Autoencoder and Genetic Algorithm
Xinyi Wang, Simon Yusuf Enoch, Dong Seong Kim

TL;DR
This paper introduces a novel black-box adversarial attack method that preserves data semantics by modifying latent features via autoencoders and uses genetic algorithms to generate effective perturbations, achieving high success rates with minimal noise.
Contribution
It proposes a semantic-preserving attack generation approach using autoencoders and genetic algorithms, improving attack success while maintaining data integrity.
Findings
Achieved a 100% attack success rate on the first 100 samples of MNIST and CIFAR-10.
Generated less perturbation than FGSM.
Validated on MNIST and CIFAR-10 datasets.
Abstract
Widely used deep learning models have been found to have poor robustness: small amounts of noise can fool state-of-the-art models into making incorrect predictions. While there are many high-performance attack generation methods, most of them add perturbations directly to the original data and measure them using L_p norms; this can break the major structure of the data and thus create invalid attacks. In this paper, we propose a black-box attack that, instead of modifying the original data, modifies latent features of the data extracted by an autoencoder; we then measure the noise in semantic space to protect the semantics of the data. We trained autoencoders on the MNIST and CIFAR-10 datasets and found optimal adversarial perturbations using a genetic algorithm. Our approach achieved a 100% attack success rate on the first 100 samples of the MNIST and CIFAR-10 datasets with less perturbation than FGSM.
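The pipeline described in the abstract (encode, perturb the latent code, decode, query the black-box model, and search with a genetic algorithm) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `latent_ga_attack`, the fitness function (latent L2 norm plus a fixed penalty when the label does not change), the selection, crossover, and mutation settings, and the toy linear encoder, decoder, and classifier are all assumptions made for this example. The paper's trained autoencoders and its actual genetic operators are not specified in the abstract.

```python
import numpy as np

def latent_ga_attack(x, encode, decode, predict, pop_size=40, generations=60,
                     sigma=0.5, penalty=10.0, seed=0):
    """Search for a small latent perturbation that flips predict(decode(z + delta)).

    Hypothetical sketch: only label queries to `predict` are used (black-box).
    """
    rng = np.random.default_rng(seed)
    z = encode(x)        # latent features of the clean input
    y0 = predict(x)      # label the model assigns to the clean input

    def fitness(delta):
        # Lower is better: perturbation size measured in latent space,
        # plus a fixed penalty if the decoded sample keeps the original label.
        fooled = predict(decode(z + delta)) != y0
        return np.linalg.norm(delta) + (0.0 if fooled else penalty)

    pop = rng.normal(0.0, sigma, size=(pop_size, z.size))
    for _ in range(generations):
        scores = np.array([fitness(d) for d in pop])
        elite = pop[np.argsort(scores)[: pop_size // 4]]  # selection: best quarter
        # Uniform crossover between two randomly chosen elite parents.
        pa = elite[rng.integers(len(elite), size=pop_size)]
        pb = elite[rng.integers(len(elite), size=pop_size)]
        children = np.where(rng.random(pa.shape) < 0.5, pa, pb)
        # Gaussian mutation with a smaller step than the initial spread.
        children = children + rng.normal(0.0, 0.1 * sigma, size=children.shape)
        children[0] = elite[0]  # elitism: keep the best individual unchanged
        pop = children

    scores = np.array([fitness(d) for d in pop])
    best = pop[np.argmin(scores)]
    x_adv = decode(z + best)
    return x_adv, best, predict(x_adv) != y0

# Toy stand-ins (assumptions): a linear "autoencoder" that keeps the first two
# of four input features, and a classifier that thresholds the first feature.
W = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
encode = lambda x: W @ x
decode = lambda z: W.T @ z
predict = lambda x: int(x[0] > 0)

x = np.array([0.1, 0.3, 0.0, 0.0])
x_adv, delta, success = latent_ga_attack(x, encode, decode, predict)
print("success:", success, "latent L2 norm:", np.linalg.norm(delta))
```

In this sketch the perturbation budget is measured on `delta` in latent space rather than on `x_adv - x` in input space, which mirrors the abstract's point about measuring noise in semantic space. With a real model, `encode` and `decode` would be the two halves of a trained autoencoder and `predict` would wrap the target classifier.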
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
