Robust Adversarial Classification via Abstaining
Abed AlRahman Al Makdah, Vaibhav Katewa, Fabio Pasqualetti

TL;DR
This paper introduces an abstain option in binary classifiers to enhance adversarial robustness, revealing a fundamental tradeoff between robustness and nominal performance, validated through theoretical analysis and experiments on MNIST.
Contribution
It formulates a framework for classifiers with abstain options to improve adversarial robustness and characterizes the tradeoff with nominal accuracy, including necessary conditions for abstain region design.
Findings
Existence of a tradeoff between robustness and nominal performance.
Theoretical conditions for abstain region design in 1D classification.
Empirical validation on MNIST demonstrating the tradeoff in multi-class settings.
Abstract
In this work, we consider a binary classification problem and cast it into a binary hypothesis testing framework, where the observations can be perturbed by an adversary. To improve the adversarial robustness of a classifier, we include an abstain option, where the classifier abstains from making a decision when it has low confidence about the prediction. We propose metrics to quantify the nominal performance of a classifier with an abstain option and its robustness against adversarial perturbations. We show that there exists a tradeoff between the two metrics regardless of what method is used to choose the abstain region. Our results imply that the robustness of a classifier with an abstain option can only be improved at the expense of its nominal performance. Further, we provide necessary conditions to design the abstain region for a 1-dimensional binary classification problem. We…
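The tradeoff described in the abstract can be illustrated with a small 1-dimensional sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two classes are modeled as unit-variance Gaussians centered at -mu and +mu with equal priors, the abstain region is the symmetric interval [-a, a], the adversary may shift an observation by at most eps, and the two rates below are simple stand-ins for the paper's metrics, whose exact definitions are not given in this summary.

```python
import math


def std_normal_cdf(z):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def abstain_classifier(x, a):
    """Return 0, 1, or None (abstain) for a scalar observation x.

    Assumed decision rule: abstain whenever x lies in [-a, a].
    """
    if x < -a:
        return 0
    if x > a:
        return 1
    return None


def nominal_correct_rate(mu, a):
    """Probability of a correct, non-abstained decision without an adversary.

    By symmetry it is enough to look at class 1, x ~ N(mu, 1):
    P(x > a) = 1 - Phi(a - mu).
    """
    return 1.0 - std_normal_cdf(a - mu)


def adversarial_error_rate(mu, a, eps):
    """Probability that a shift of size at most eps forces a wrong decision.

    A class-1 sample can be pushed into the class-0 region when
    x - eps < -a, so the rate is Phi(-a + eps - mu).
    """
    return std_normal_cdf(-a + eps - mu)


if __name__ == "__main__":
    mu, eps = 1.0, 0.5  # hypothetical values
    for a in (0.0, 0.25, 0.5, 1.0):
        print(a, nominal_correct_rate(mu, a), adversarial_error_rate(mu, a, eps))
```

Under these assumptions, widening the abstain region (larger `a`) lowers the adversarial error rate and also lowers the nominal correct rate, which is the qualitative behavior the abstract describes.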
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Machine Learning and Algorithms
