Towards an Intrinsic Definition of Robustness for a Classifier
Théo Giraudon, Vincent Gripon, Matthias Löwe, Franck Vermet

TL;DR
This paper proposes an intrinsic measure of classifier robustness that weights samples by their difficulty. The authors argue it is more reliable than averaging the radius of robustness over a validation set, and support it with a theoretical case study on logistic regression and with empirical experiments.
Contribution
It introduces a robustness score weighted by sample difficulty. The score is shown theoretically to be independent of the choice of samples for logistic regression, and empirically to depend little on that choice.
Findings
The proposed score is independent of sample selection in logistic regression.
It effectively measures robustness in deep neural networks.
The score offers a more reliable robustness assessment than average radius methods.
Abstract
The robustness of classifiers has become a question of paramount importance in the past few years. Indeed, it has been shown that state-of-the-art deep learning architectures can easily be fooled with imperceptible changes to their inputs. Therefore, finding good measures of robustness of a trained classifier is a key issue in the field. In this paper, we point out that averaging the radius of robustness of samples in a validation set is a statistically weak measure. We propose instead to weight the importance of samples depending on their difficulty. We motivate the proposed score by a theoretical case study using logistic regression, where we show that the proposed score is independent of the choice of the samples it is evaluated upon. We also empirically demonstrate the ability of the proposed score to measure robustness of classifiers with little dependence on the choice of samples…
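The abstract's objection to averaging the radius of robustness can be illustrated with a minimal sketch. It assumes a linear (logistic regression) classifier, for which the L2 radius of robustness of a sample is its distance to the decision hyperplane, |w·x + b| / ||w||. The weights, the synthetic data, and the helper name `robustness_radius` are hypothetical choices made for illustration. The paper's difficulty-weighted score is not reproduced here, because this summary does not give its definition.

```python
import numpy as np


def robustness_radius(X, w, b):
    """L2 distance from each row of X to the hyperplane w.x + b = 0.

    For a linear classifier this is the smallest L2 perturbation
    that moves a sample onto the decision boundary.
    """
    return np.abs(X @ w + b) / np.linalg.norm(w)


# Hypothetical classifier and synthetic validation set (illustration only).
rng = np.random.default_rng(0)
w = np.array([3.0, 4.0])  # ||w|| = 5
b = 0.0
X = rng.normal(size=(1000, 2))

radii = robustness_radius(X, w, b)

# Split the same validation set into "hard" samples (close to the
# boundary) and "easy" samples (far from it), then average each half.
order = np.argsort(radii)
hard_mean = radii[order[:500]].mean()
easy_mean = radii[order[500:]].mean()
full_mean = radii.mean()

print(f"mean radius, hard half: {hard_mean:.3f}")
print(f"mean radius, all:       {full_mean:.3f}")
print(f"mean radius, easy half: {easy_mean:.3f}")
```

The classifier is identical in all three cases, yet the average radius changes with the samples it is computed on. This is the dependence on sample choice that the proposed score is meant to reduce.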
