Detecting adversarial attacks on random samples
Gleb Smirnov

TL;DR
This paper investigates when adversarial perturbations of a sample drawn from a standard normal distribution can be detected, establishing thresholds for when such attacks can or cannot be reliably identified.
Contribution
It provides a theoretical analysis of the conditions under which adversarial attacks on standard normal samples are detectable or undetectable, with precise thresholds.
Findings
Detection thresholds depend on perturbation magnitude and sparsity.
When the perturbation magnitude falls below a threshold, detection becomes impossible.
The study offers a clear characterization of detectability boundaries.
Abstract
This paper studies the problem of detecting adversarial perturbations in a sequence of observations. Given a data sample drawn from a standard normal distribution, an adversary, after observing the sample, can perturb each observation by a fixed magnitude or leave it unchanged. We explore the relationship between the perturbation magnitude, the sparsity of the perturbation, and the detectability of the adversary's actions, establishing precise thresholds for when detection becomes impossible.
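The setup described in the abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's adversary or detector: the function names, the rule of shifting the k smallest observations by +delta, and the mean-based statistic are all hypothetical choices made for illustration. It shows how the perturbation magnitude (delta) and the sparsity (k out of n) jointly determine how far a simple test statistic moves.

```python
import math
import random


def draw_sample(n, rng):
    """Draw n i.i.d. standard normal observations."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def perturb_smallest(sample, k, delta):
    """Hypothetical adversary: after seeing the sample, shift the k
    smallest observations by +delta and leave the rest unchanged."""
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    chosen = set(order[:k])
    return [x + delta if i in chosen else x for i, x in enumerate(sample)]


def mean_z_statistic(sample):
    """sqrt(n) times the sample mean; distributed N(0, 1) when the
    sample is an unperturbed standard normal sample."""
    n = len(sample)
    return math.sqrt(n) * (sum(sample) / n)


rng = random.Random(0)
n, k, delta = 10_000, 200, 0.5  # illustrative values, not from the paper

clean = draw_sample(n, rng)
attacked = perturb_smallest(clean, k, delta)

# The statistic moves by exactly k * delta / sqrt(n), whichever
# observations the adversary picks.
shift = mean_z_statistic(attacked) - mean_z_statistic(clean)
print(f"z (clean)    = {mean_z_statistic(clean):.3f}")
print(f"z (attacked) = {mean_z_statistic(attacked):.3f}")
print(f"shift        = {shift:.3f}")
```

For this naive statistic the shift is k * delta / sqrt(n), so a perturbation that is sparse or small relative to sqrt(n) barely moves it. An adversary who chooses which observations to alter after seeing the sample can hide more effectively than this sketch suggests, which is the regime the paper's thresholds address.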
Taxonomy
Topics: Process Optimization and Integration · Global Energy and Sustainability Research
