Stratified Adversarial Robustness with Rejection
Jiefeng Chen, Jayaram Raghuram, Jihye Choi, Xi Wu, Yingyu Liang,, Somesh Jha

TL;DR
This paper introduces a stratified rejection framework for adversarial robustness, proposing a new defense method called CPR that improves selective classification performance against strong attacks.
Contribution
It presents a theoretical analysis of stratified rejection and introduces CPR, a novel adversarial training method for robust selective classification with rejection.
Findings
CPR significantly reduces robust loss on CIFAR-10.
CPR outperforms existing methods under adaptive attacks.
Stratified rejection effectively balances rejection costs and robustness.
Abstract
Recently, there is an emerging interest in adversarially training a classifier with a rejection option (also known as a selective classifier) for boosting adversarial robustness. While rejection can incur a cost in many applications, existing studies typically associate zero cost with rejecting perturbed inputs, which can result in the rejection of numerous slightly-perturbed inputs that could be correctly classified. In this work, we study adversarially-robust classification with rejection in the stratified rejection setting, where the rejection cost is modeled by rejection loss functions monotonically non-increasing in the perturbation magnitude. We theoretically analyze the stratified rejection setting and propose a novel defense method -- Adversarial Training with Consistent Prediction-based Rejection (CPR) -- for building a robust selective classifier. Experiments on image datasets…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
Taxonomy
TopicsAdversarial Robustness in Machine Learning · Bacillus and Francisella bacterial research
