A general metric for identifying adversarial images
Siddharth Krishna Kumar

TL;DR
This paper introduces a universal metric based on spectral analysis of images that reliably detects adversarial images across various datasets and attack strategies without needing recalibration.
Contribution
The study proposes a novel spectral-based metric that generalizes adversarial image detection to unknown attack strategies, removing the need to recalibrate the detector each time the adversary changes strategy.
Findings
Effective detection across multiple datasets
No need for recalibration with changing attack strategies
Provides geometric insights into adversarial perturbations
Abstract
It is well known that a determined adversary can fool a neural network by making imperceptible adversarial perturbations to an image. Recent studies have shown that these perturbations can be detected even without information about the neural network if the strategy taken by the adversary is known beforehand. Unfortunately, these studies suffer from the generalization limitation -- the detection method has to be recalibrated every time the adversary changes his strategy. In this study, we attempt to overcome the generalization limitation by deriving a metric which reliably identifies adversarial images even when the approach taken by the adversary is unknown. Our metric leverages key differences between the spectra of clean and adversarial images when an image is treated as a matrix. Our metric is able to detect adversarial images across different datasets and attack strategies without…
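The abstract's core idea, treating an image as a matrix and inspecting its spectrum, can be illustrated with a minimal sketch. The paper's actual metric is not stated in the text above, so everything here is an assumption made for illustration: "spectrum" is taken to mean the singular values of a grayscale image, `tail_energy_fraction` is a hypothetical summary statistic, the "clean" image is a synthetic smooth pattern, and the perturbation is small random noise standing in for a real adversarial attack.

```python
import numpy as np


def singular_spectrum(image):
    """Return the singular values of a 2-D image array, largest first."""
    matrix = np.asarray(image, dtype=np.float64)
    if matrix.ndim != 2:
        raise ValueError("expected a 2-D (grayscale) image")
    # compute_uv=False returns only the singular values, sorted descending.
    return np.linalg.svd(matrix, compute_uv=False)


def tail_energy_fraction(image, keep=8):
    """Hypothetical statistic: share of squared singular-value energy
    that lies outside the `keep` largest singular values."""
    energy = singular_spectrum(image) ** 2
    return float(energy[keep:].sum() / energy.sum())


# Synthetic stand-in for a clean image: a smooth pattern of rank at most 2.
x = np.linspace(0.0, 1.0, 64)
clean = np.outer(np.sin(2 * np.pi * x), np.cos(2 * np.pi * x)) + 1.0

# Assumption: small random noise stands in for an adversarial perturbation.
rng = np.random.default_rng(0)
perturbed = clean + 0.01 * rng.standard_normal(clean.shape)

print("clean tail energy:    ", tail_energy_fraction(clean))
print("perturbed tail energy:", tail_energy_fraction(perturbed))
```

In this toy setting the perturbation spreads energy into the small singular values, so the tail fraction rises even though the pixel-level change is tiny. Whether real adversarial perturbations behave this way, and which spectral statistic the paper actually uses, is not something this sketch establishes.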
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Digital Media Forensic Detection · Bacillus and Francisella bacterial research
