TL;DR
This paper introduces an evolutionary method using genetic algorithms to automatically find optimal sequences of image processing techniques for detecting adversarial attacks on machine learning models, demonstrating promising results across multiple datasets.
Contribution
It proposes a novel evolutionary approach to automatically determine effective image processing sequences for adversarial attack detection, adaptable to various attack types and datasets.
Findings
The method successfully detects adversarial samples across multiple datasets.
A different Image Processing Techniques Sequence (IPTS) is generated for each attack type and dataset.
The approach shows promising efficiency as a preprocessing step for AI models.
Abstract
Securing machine learning models against adversarial examples is challenging because new methods for generating adversarial attacks are continually being developed. In this work, we propose an evolutionary approach to automatically determine an Image Processing Techniques Sequence (IPTS) for detecting malicious inputs. Accordingly, we first use a diverse set of attack methods, including adaptive attacks on our defense, to generate adversarial samples from the clean dataset. A detection framework based on a genetic algorithm (GA) is developed to find the optimal IPTS, where optimality is estimated by different fitness measures such as Euclidean distance, entropy loss, average histogram, local binary pattern and loss functions. The "image difference" between the original and processed images is used to extract the features, which are then fed to a classification scheme in order…
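The GA search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The technique pool (`invert`, `threshold`, `quantize`, `box_blur`), the fixed sequence length, the GA settings, and the single fitness measure are all assumptions made for illustration. The fitness used here is the gap between the mean Euclidean norm of the "image difference" for adversarial and for clean samples. The paper lists several other fitness measures, which are not reproduced.

```python
import math
import random

# Hypothetical technique pool. Images are lists of rows of 0-255 integers.
def invert(img):
    return [[255 - p for p in row] for row in img]

def threshold(img):
    return [[255 if p >= 128 else 0 for p in row] for row in img]

def quantize(img):
    return [[(p // 64) * 64 for p in row] for row in img]

def box_blur(img):
    # 3x3 mean filter; border pixels average over the neighbours that exist.
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(vals) // len(vals))
        out.append(row)
    return out

TECHNIQUES = {"invert": invert, "threshold": threshold,
              "quantize": quantize, "box_blur": box_blur}

def apply_sequence(seq, img):
    # Apply each named technique in order (one candidate IPTS).
    for name in seq:
        img = TECHNIQUES[name](img)
    return img

def difference_norm(seq, img):
    # Euclidean norm of the "image difference" between input and processed image.
    out = apply_sequence(seq, img)
    return math.sqrt(sum((a - b) ** 2
                         for ra, rb in zip(img, out)
                         for a, b in zip(ra, rb)))

def fitness(seq, clean, adversarial):
    # Assumed fitness: how far apart the two classes land after processing.
    mean_clean = sum(difference_norm(seq, im) for im in clean) / len(clean)
    mean_adv = sum(difference_norm(seq, im) for im in adversarial) / len(adversarial)
    return abs(mean_adv - mean_clean)

def evolve(clean, adversarial, seq_len=3, pop_size=12,
           generations=15, mutation_rate=0.2, seed=0):
    # seq_len must be at least 2 so that a crossover point exists.
    rng = random.Random(seed)
    names = list(TECHNIQUES)
    population = [[rng.choice(names) for _ in range(seq_len)]
                  for _ in range(pop_size)]

    def score(seq):
        return fitness(seq, clean, adversarial)

    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        next_gen = ranked[:2]  # elitism: carry over the two best sequences
        while len(next_gen) < pop_size:
            # Tournament selection of two parents, then one-point crossover.
            a = max(rng.sample(population, 3), key=score)
            b = max(rng.sample(population, 3), key=score)
            cut = rng.randint(1, seq_len - 1)
            child = a[:cut] + b[cut:]
            # Mutation: replace a gene with a random technique.
            child = [rng.choice(names) if rng.random() < mutation_rate else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    return max(population, key=score)
```

Calling `evolve(clean_images, adversarial_images)` returns a list of technique names. The sketch stops at the search step. In the abstract, features extracted from the image difference are then passed to a classification scheme, which is not shown here.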
