TL;DR
GF-Score decomposes the certified GREAT Score into per-class robustness profiles and quantifies how unevenly robustness is distributed across classes, without requiring adversarial attacks.
Contribution
The paper introduces GF-Score, a novel attack-free framework for decomposing and quantifying class-level robustness disparities in neural networks.
Findings
Per-class robustness scores reveal consistent vulnerability patterns.
More robust models tend to have greater class disparity.
The framework is exact and applicable to models on CIFAR-10 and ImageNet.
Abstract
Adversarial robustness is essential for deploying neural networks in safety-critical applications, yet standard evaluation methods either require expensive adversarial attacks or report only a single aggregate score that obscures how robustness is distributed across classes. We introduce the GF-Score (GREAT-Fairness Score), a framework that decomposes the certified GREAT Score into per-class robustness profiles and quantifies their disparity through four metrics grounded in welfare economics: the Robustness Disparity Index (RDI), the Normalized Robustness Gini Coefficient (NRGC), Worst-Case Class Robustness (WCR), and a Fairness-Penalized GREAT Score (FP-GREAT). The framework further eliminates the original method's dependence on adversarial attacks through a self-calibration procedure that tunes the temperature parameter using only clean accuracy correlations. Evaluating 22…
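To make the per-class decomposition concrete, here is a minimal Python sketch of how disparity statistics of this kind could be computed from per-sample robustness scores. The abstract does not give the formulas for RDI, NRGC, WCR, or FP-GREAT, so everything below is an assumption: the function names, the range-over-mean disparity, the n/(n-1) Gini normalization, and the multiplicative penalty are illustrative stand-ins, not the paper's definitions. Only the Gini coefficient itself is the standard textbook formula.

```python
import numpy as np


def per_class_scores(sample_scores, labels, num_classes):
    """Average the per-sample robustness scores within each class.

    Assumes every class in range(num_classes) has at least one sample.
    """
    sample_scores = np.asarray(sample_scores, dtype=float)
    labels = np.asarray(labels)
    return np.array(
        [sample_scores[labels == c].mean() for c in range(num_classes)]
    )


def gini(values):
    """Standard Gini coefficient of non-negative values (0 means equal)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = v.size
    total = v.sum()
    if total == 0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(2.0 * (ranks * v).sum() / (n * total) - (n + 1) / n)


def disparity_summary(class_scores, penalty=1.0):
    """Hypothetical stand-ins for the four metrics named in the abstract."""
    s = np.asarray(class_scores, dtype=float)
    n = s.size
    mean = float(s.mean())
    # Assumed disparity index: spread between best and worst class,
    # relative to the mean score.
    range_disparity = float((s.max() - s.min()) / mean) if mean > 0 else 0.0
    # Assumed normalization: rescale Gini so its maximum is 1 for n classes.
    normalized_gini = gini(s) * n / (n - 1) if n > 1 else 0.0
    return {
        "range_disparity": range_disparity,
        "normalized_gini": normalized_gini,
        "worst_class": float(s.min()),
        # Assumed penalty form: shrink the mean score as inequality grows.
        "penalized_mean": mean * (1.0 - penalty * normalized_gini),
    }


if __name__ == "__main__":
    scores = per_class_scores([1.0, 3.0, 2.0, 2.0], [0, 0, 1, 1], num_classes=2)
    print(disparity_summary(scores))
```

In this sketch a model whose classes are equally robust gets zero disparity and an unpenalized mean, while a model whose robustness is concentrated in a single class gets a normalized Gini of 1 and a penalized mean of 0.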
