Provably Safeguarding a Classifier from OOD and Adversarial Samples: an Extreme Value Theory Approach
Nicolas Atienza, Christophe Labreuche, Johanne Cohen, Michele Sebag

TL;DR
This paper presents SPADE, a novel method using Extreme Value Theory to transform classifiers into abstaining models that provably reject out-of-distribution and adversarial samples, enhancing robustness.
Contribution
Introduces SPADE, a probabilistic abstaining classifier leveraging GEV models for formal OOD and adversarial sample detection with provable guarantees.
Findings
Effective across multiple neural architectures (ResNet, VGG, and Vision Transformer) on CIFAR-10, CIFAR-100, and ImageNet.
Demonstrates robustness against OOD and adversarial samples.
Outperforms existing methods in efficiency and stability.
Abstract
This paper introduces a novel method, Sample-efficient Probabilistic Detection using Extreme Value Theory (SPADE), which transforms a classifier into an abstaining classifier, offering provable protection against out-of-distribution and adversarial samples. The approach is based on a Generalized Extreme Value (GEV) model of the training distribution in the classifier's latent space, enabling the formal characterization of OOD samples. Interestingly, under mild assumptions, the GEV model also allows for formally characterizing adversarial samples. The abstaining classifier, which rejects samples based on their assessment by the GEV model, provably avoids OOD and adversarial samples. The empirical validation of the approach, conducted on various neural architectures (ResNet, VGG, and Vision Transformer) and medium and large-sized datasets (CIFAR-10, CIFAR-100, and ImageNet), demonstrates…
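To make the idea in the abstract concrete, the following is a minimal sketch of an extreme-value rejection rule in a latent space. It is not the authors' SPADE implementation. Everything in it is an assumption made for illustration: the synthetic 2-D "latent" vectors, the centroid-distance score, the block size of 50, the threshold `alpha`, and the helper names (`fit_gumbel`, `gumbel_sf`, `block_maxima`, `predict_or_abstain`). For simplicity it fits the Gumbel law, which is the special case of the GEV family with shape parameter zero, using the method of moments rather than a full GEV fit.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def block_maxima(values, block_size):
    """Split values into consecutive full blocks and keep each block's maximum."""
    return [max(values[i:i + block_size])
            for i in range(0, len(values) - block_size + 1, block_size)]


def fit_gumbel(maxima):
    """Method-of-moments fit of a Gumbel law (GEV with shape parameter 0)."""
    scale = statistics.stdev(maxima) * math.sqrt(6) / math.pi
    loc = statistics.mean(maxima) - EULER_GAMMA * scale
    return loc, scale


def gumbel_sf(x, loc, scale):
    """Probability that a block maximum exceeds x under the fitted law."""
    return 1.0 - math.exp(-math.exp(-(x - loc) / scale))


# Hypothetical stand-in for latent features of training samples of one class.
rng = random.Random(0)
train = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(2000)]
centroid = tuple(statistics.fmean(axis) for axis in zip(*train))

# Score each training sample by its distance to the class centroid,
# then model the largest distances seen in blocks of 50 samples.
train_dists = [math.dist(z, centroid) for z in train]
loc, scale = fit_gumbel(block_maxima(train_dists, 50))


def predict_or_abstain(z, alpha=0.01):
    """Abstain when z is farther out than the fitted tail makes plausible."""
    tail_prob = gumbel_sf(math.dist(z, centroid), loc, scale)
    return "abstain" if tail_prob < alpha else "accept"


print(predict_or_abstain((0.0, 0.0)))
print(predict_or_abstain((10.0, 10.0)))
```

The design choice here is that the rejection threshold is stated as a tail probability under a fitted extreme-value law rather than as a hand-tuned distance cutoff, which is the kind of formal characterization the abstract refers to. The paper's actual score, GEV fit, and guarantees may differ from this sketch.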
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Bacillus and Francisella bacterial research
Methods: Max Pooling · Dense Connections · Convolution · Dropout · Softmax
