Detecting adversarial attacks on random samples

Gleb Smirnov

arXiv:2408.06166·math.PR·October 28, 2024

TL;DR

This paper investigates when adversarial perturbations of a sample drawn from a standard normal distribution can be detected, establishing thresholds that separate attacks that can be reliably identified from those that cannot.

Contribution

The paper gives a theoretical analysis of the conditions under which adversarial perturbations of a standard normal sample are detectable or undetectable, with precise thresholds.

Findings

01. Detection thresholds depend on perturbation magnitude and sparsity.

02. When perturbations are below a certain magnitude, detection becomes impossible.

03. The study offers a clear characterization of detectability boundaries.

Abstract

This paper studies the problem of detecting adversarial perturbations in a sequence of observations. Given a data sample $X_{1}, \dots, X_{n}$ drawn from a standard normal distribution, an adversary, after observing the sample, can perturb each observation by a fixed magnitude or leave it unchanged. We explore the relationship between the perturbation magnitude, the sparsity of the perturbation, and the detectability of the adversary's actions, establishing precise thresholds for when detection becomes impossible.
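The setup in the abstract can be made concrete with a small simulation. The sketch below is illustrative only and is not taken from the paper: the function names, the adversary's strategy (shifting the k smallest observations upward by delta), and the detector (a rescaled sample mean) are all assumptions chosen for simplicity. The abstract does not say which strategies or tests the paper analyzes.

```python
import random
import statistics


def draw_sample(n, rng):
    """Draw n i.i.d. observations from the standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def perturb(sample, delta, k):
    """Hypothetical adversary (an assumption, not the paper's strategy).

    After seeing the whole sample, it shifts the k smallest observations
    by the fixed magnitude +delta and leaves the others unchanged.
    """
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    chosen = set(order[:k])
    return [x + delta if i in chosen else x for i, x in enumerate(sample)]


def mean_statistic(sample):
    """Naive detector statistic: sqrt(n) times the sample mean.

    For an unperturbed standard normal sample this is itself standard normal.
    """
    n = len(sample)
    return (n ** 0.5) * statistics.fmean(sample)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = draw_sample(1000, rng)
    attacked = perturb(clean, delta=0.5, k=100)
    print("clean statistic:   ", mean_statistic(clean))
    print("attacked statistic:", mean_statistic(attacked))
```

For this particular statistic, any attack that shifts k observations by +delta moves the value by exactly k * delta / sqrt(n), so this naive detector has little power when k * delta is small relative to sqrt(n). This observation concerns only the toy detector above and should not be read as the paper's threshold.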

Taxonomy

Topics: Process Optimization and Integration · Global Energy and Sustainability Research