A Comprehensive Evaluation Framework for Deep Model Robustness
Jun Guo, Wei Bao, Jiakai Wang, Yuqing Ma, Xinghai Gao, Gang Xiao, Aishan Liu, Jian Dong, Xianglong Liu, Wenjun Wu

TL;DR
This paper introduces a comprehensive evaluation framework with 23 metrics for assessing deep neural network robustness against adversarial attacks, addressing limitations of previous simple metrics and promoting deeper understanding.
Contribution
It proposes a novel, multi-perspective evaluation framework with a large set of metrics and an open-source toolkit for thorough robustness assessment of deep models.
Findings
The framework effectively differentiates model robustness across datasets and defenses.
Large-scale experiments validate the framework's ability to reveal the strengths and weaknesses of the evaluated models and defenses.
The open-source platform facilitates rapid and comprehensive robustness evaluations.
Abstract
Deep neural networks (DNNs) have achieved remarkable performance across a wide range of applications, yet they are vulnerable to adversarial examples, which motivates the evaluation and benchmarking of model robustness. However, current evaluations usually rely on simple metrics to study the performance of defenses, which falls far short of revealing the limitations and weaknesses of these defense methods. As a result, most proposed defenses are quickly shown to be breakable, which leads to an "arms race" between attack and defense. To mitigate this problem, we establish a model robustness evaluation framework containing 23 comprehensive and rigorous metrics, which consider two key perspectives of adversarial learning (i.e., data and model). Through neuron coverage and data imperceptibility, we use data-oriented metrics to measure the integrity of test examples; by delving…
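To make the idea of a neuron coverage metric concrete, here is a minimal sketch. It assumes the commonly used definition (the fraction of neurons whose activation exceeds a threshold on at least one test input) and activations already scaled to [0, 1]. The function name `neuron_coverage` and the default threshold are hypothetical choices for illustration; this is not the paper's toolkit or its exact metric.

```python
def neuron_coverage(activations, threshold=0.5):
    """Fraction of neurons activated above `threshold` by at least one input.

    `activations` is a list of rows, one row per test input, each row holding
    one activation value per neuron (assumed scaled to [0, 1]).
    """
    if not activations or not activations[0]:
        return 0.0  # no inputs or no neurons: nothing is covered
    num_neurons = len(activations[0])
    # A neuron counts as covered if any input pushes it above the threshold.
    covered = [
        any(row[j] > threshold for row in activations)
        for j in range(num_neurons)
    ]
    return sum(covered) / num_neurons


# Two test inputs, four neurons (illustrative values only).
acts = [
    [0.9, 0.1, 0.2, 0.0],
    [0.3, 0.7, 0.1, 0.0],
]
coverage = neuron_coverage(acts)
```

In this toy example, the first two neurons each exceed the threshold on one input and the last two never do, so half of the neurons are covered. A higher value suggests the test set exercises more of the model's internal behavior.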
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
