On the Perils of Cascading Robust Classifiers
Ravi Mangal, Zifan Wang, Chi Zhang, Klas Leino, Corina Pasareanu, and Matt Fredrikson

TL;DR
This paper uncovers a fundamental flaw in cascading robust classifiers, showing that their certifiable robustness claims can be invalidated by adversarial attacks, thus questioning their reliability in safety-critical applications.
Contribution
The paper demonstrates that cascading ensembles' robustness certifiers are unsound and introduces CasA, an attack method exposing their vulnerabilities.
Findings
Cascading ensembles can be falsely certified as robust.
CasA attack reduces ensemble accuracy to as low as 11%.
Up to 88% of claimed robust samples are vulnerable to adversarial inputs.
Abstract
Ensembling certifiably robust neural networks is a promising approach for improving the \emph{certified robust accuracy} of neural models. Black-box ensembles that assume only query-access to the constituent models (and their robustness certifiers) during prediction are particularly attractive due to their modular structure. Cascading ensembles are a popular instance of black-box ensembles that appear to improve certified robust accuracies in practice. However, we show that the robustness certifier used by a cascading ensemble is unsound. That is, when a cascading ensemble is certified as locally robust at an input $x$ (with respect to $\epsilon$), there can be inputs $x'$ in the $\epsilon$-ball centered at $x$, such that the cascade's prediction at $x'$ is different from its prediction at $x$, and thus the ensemble is not locally robust. Our theoretical findings are accompanied by empirical results that…
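The unsoundness described in the abstract can be illustrated with a small toy sketch. It is not the paper's code: the one-dimensional threshold models, the helper names `threshold_model` and `cascade_predict`, and the cascading rule (return the prediction of the first constituent model whose certifier succeeds) are assumptions chosen for illustration. Each constituent certifier is sound on its own, yet the cascade reports "certified" at two nearby inputs with different labels.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: a model is a (predict, certify) pair.
Model = Tuple[Callable[[float], int], Callable[[float, float], bool]]


def threshold_model(t: float) -> Model:
    """1-D classifier: label 1 if x > t, else 0, with an exact certifier."""
    def predict(x: float) -> int:
        return 1 if x > t else 0

    def certify(x: float, eps: float) -> bool:
        # Sound for this model: the whole interval [x - eps, x + eps]
        # lies strictly on one side of the threshold t.
        return abs(x - t) > eps

    return predict, certify


def cascade_predict(x: float, models: List[Model], eps: float) -> Tuple[int, bool]:
    """Assumed cascading rule: use the first model that certifies at x."""
    for predict, certify in models:
        if certify(x, eps):
            return predict(x), True
    # No model certifies: fall back to the last model, uncertified.
    return models[-1][0](x), False


eps = 1.0
models = [threshold_model(0.0), threshold_model(5.0)]

x = 1.5        # model 1 certifies here and predicts label 1
x_prime = 0.8  # within eps of x; model 1 no longer certifies, model 2 does

print(cascade_predict(x, models, eps))        # (1, True)
print(cascade_predict(x_prime, models, eps))  # (0, True)
```

In this sketch the cascade claims robustness at `x`, but `x_prime` lies inside the `eps`-ball around `x` and receives a different label, because a different constituent model answers there.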
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Anomaly Detection Techniques and Applications
