On the Suitability of $L_p$-norms for Creating and Preventing Adversarial Examples
Mahmood Sharif, Lujo Bauer, and Michael K. Reiter

TL;DR
This paper critically examines the use of $L_p$-norms in adversarial example research, showing they are unreliable for measuring perceptual similarity and proposing alternative metrics for future work.
Contribution
The study demonstrates that $L_p$-norms are both insufficient and unnecessary for assessing perceptual similarity in adversarial examples, supported by user studies on CIFAR10 and MNIST.
Findings
$L_p$-norms often do not align with human perception of similarity.
Adversarial examples can be perceptually different from their benign originals despite small $L_p$-norm distances.
Current thresholds based on $L_p$-norms are unreliable for defending against adversarial attacks.
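To make the quantities in these findings concrete, the following minimal sketch computes $L_0$, $L_2$, and $L_\infty$ distances between a toy image and two modified copies. It is an illustration only, not the authors' code: the helper `lp_distance`, the 8x8 synthetic image, and the perturbation sizes are all assumptions chosen for readability, and the sketch says nothing about how people would actually judge these images.

```python
import numpy as np


def lp_distance(x, y, p):
    """L_p distance between two same-shaped arrays (hypothetical helper).

    p = 0 counts changed pixels, p = np.inf takes the largest absolute
    change, and any other positive p uses the usual (sum |d|^p)^(1/p).
    """
    diff = (np.asarray(x, dtype=np.float64) - np.asarray(y, dtype=np.float64)).ravel()
    if p == 0:
        return float(np.count_nonzero(diff))
    if np.isinf(p):
        return float(np.max(np.abs(diff)))
    return float(np.sum(np.abs(diff) ** p) ** (1.0 / p))


# Toy 8x8 grayscale image in [0, 1]: a bright 4x4 square on a dark background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 0.9

# Variant A: the same square moved one pixel to the right.
shifted = np.roll(image, 1, axis=1)

# Variant B: a small constant offset added to every pixel.
offset = image + 0.03

for name, variant in [("shifted", shifted), ("offset", offset)]:
    distances = {p: lp_distance(image, variant, p) for p in (0, 2, np.inf)}
    print(name, distances)
```

In this toy setup the one-pixel shift changes few pixels by a large amount, while the constant offset changes every pixel by a small amount, so the ranking of the two variants depends on which norm is used. Which variant a person would consider closer to the original is an empirical question of the kind the user studies address.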
Abstract
Much research effort has been devoted to better understanding adversarial examples, which are specially crafted inputs to machine-learning models that are perceptually similar to benign inputs, but are classified differently (i.e., misclassified). Both algorithms that create adversarial examples and strategies for defending against them typically use $L_p$-norms to measure the perceptual similarity between an adversarial input and its benign original. Prior work has already shown, however, that two images need not be close to each other as measured by an $L_p$-norm to be perceptually similar. In this work, we show that nearness according to an $L_p$-norm is not just unnecessary for perceptual similarity, but is also insufficient. Specifically, focusing on datasets (CIFAR10 and MNIST), $L_p$-norms, and thresholds used in prior work, we show through online user studies that "adversarial…
