Sample Efficient Detection and Classification of Adversarial Attacks via Self-Supervised Embeddings
Mazda Moayeri, Soheil Feizi

TL;DR
This paper introduces SimCat, a method that uses self-supervised embeddings to detect and classify adversarial attacks and data poisoning against deep models, reaching high accuracy from few training samples.
Contribution
The paper trains a simple linear classifier on SimCLR embeddings to detect and classify adversarial attacks efficiently across a broad range of threat models, including unseen ones.
Findings
Over 85% detection accuracy on SVHN with minimal data
Classifies 8 attack types on ImageNet with over 40% accuracy
Halves poisoning success rate on STL10 with limited poisons
Abstract
Adversarial robustness of deep models is pivotal in ensuring safe deployment in real world settings, but most modern defenses have narrow scope and expensive costs. In this paper, we propose a self-supervised method to detect adversarial attacks and classify them to their respective threat models, based on a linear model operating on the embeddings from a pre-trained self-supervised encoder. We use a SimCLR encoder in our experiments, since we show the SimCLR embedding distance is a good proxy for human perceptibility, enabling it to encapsulate many threat models at once. We call our method SimCat since it uses SimCLR encoder to catch and categorize various types of adversarial attacks, including L_p and non-L_p evasion attacks, as well as data poisonings. The simple nature of a linear classifier makes our method efficient in both time and sample complexity. For example, on SVHN, using…
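The core idea in the abstract, a linear model trained on embeddings from a frozen encoder, can be illustrated with a minimal sketch. This is not the authors' code. The SimCLR encoder is replaced here by synthetic Gaussian clusters (an assumption made so the example runs without a pre-trained model), and the class names, dimensions, and hyperparameters are hypothetical.

```python
import numpy as np


def softmax(logits):
    # Subtract the row-wise max for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def fit_linear_probe(embeddings, labels, num_classes, lr=1.0, epochs=300, l2=1e-3):
    """Train a softmax linear classifier on fixed embeddings with gradient descent."""
    n, dim = embeddings.shape
    weights = np.zeros((dim, num_classes))
    bias = np.zeros(num_classes)
    targets = np.eye(num_classes)[labels]  # one-hot labels
    for _ in range(epochs):
        probs = softmax(embeddings @ weights + bias)
        grad_logits = (probs - targets) / n  # gradient of mean cross-entropy
        weights -= lr * (embeddings.T @ grad_logits + l2 * weights)
        bias -= lr * grad_logits.sum(axis=0)
    return weights, bias


def predict(embeddings, weights, bias):
    return np.argmax(embeddings @ weights + bias, axis=1)


def fake_embeddings(centers, per_class, rng):
    """Stand-in for encoder outputs (assumption): one Gaussian cluster per class,
    L2-normalized per sample."""
    feats = np.concatenate(
        [c + rng.normal(size=(per_class, c.size)) for c in centers]
    )
    feats /= np.linalg.norm(feats, axis=1, keepdims=True)
    labels = np.repeat(np.arange(len(centers)), per_class)
    return feats, labels


rng = np.random.default_rng(0)
class_names = ["clean", "attack_a", "attack_b"]  # hypothetical threat-model labels
centers = rng.normal(scale=3.0, size=(len(class_names), 32))

# Few labeled samples per class for training, more for evaluation.
train_x, train_y = fake_embeddings(centers, 20, rng)
test_x, test_y = fake_embeddings(centers, 100, rng)

weights, bias = fit_linear_probe(train_x, train_y, len(class_names))
test_acc = float((predict(test_x, weights, bias) == test_y).mean())
print(f"held-out accuracy: {test_acc:.2f}")
```

In a real pipeline, `fake_embeddings` would be replaced by a forward pass of clean and attacked images through the pre-trained encoder. Only the small linear layer is trained, which is why the approach is cheap in both time and labeled samples.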
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
Methods: 1x1 Convolution · Average Pooling · Residual Connection · Residual Block · Kaiming Initialization · Convolution · Batch Normalization · Global Average Pooling
