Towards Precise Observations of Neural Model Robustness in Classification

Wenchuan Mu; Kwan Hui Lim

arXiv:2404.16457·cs.SE·April 26, 2024

TL;DR

This paper introduces a practical metric, based on hypothesis testing, for assessing the probabilistic robustness of neural models. It aims to make pre-deployment safety evaluation in critical applications more precise and less costly.

Contribution

The paper compares the rigour and usage conditions of existing robustness assessment methods, then proposes a straightforward metric for probabilistic robustness based on hypothesis testing, which the authors have integrated into the TorchAttacks library.

Findings

01. The proposed metric offers a more precise robustness evaluation.

02. It is cost-effective and easy to implement.

03. The approach enhances understanding of model robustness in safety-critical scenarios.

Abstract

In deep learning applications, robustness measures the ability of neural models to handle slight changes in input data. A lack of robustness can lead to safety hazards, especially in safety-critical applications. Pre-deployment assessment of model robustness is essential, but existing methods often suffer from either high costs or imprecise results. To enhance safety in real-world scenarios, metrics that effectively capture the model's robustness are needed. To address this issue, we compare the rigour and usage conditions of various assessment methods based on different definitions. Then, we propose a straightforward and practical metric utilizing hypothesis testing for probabilistic robustness and have integrated it into the TorchAttacks library. Through a comparative analysis of diverse robustness assessment methods, our approach contributes to a deeper understanding of model…
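As a rough illustration of what a hypothesis test for probabilistic robustness can look like, the sketch below samples random perturbations around one input, counts misclassifications, and runs a one-sided exact binomial test. It is not the authors' metric and does not use the TorchAttacks API. The names `binomial_cdf`, `robustness_test`, and `toy_classifier`, the uniform noise model, and the parameters `kappa` (tolerated failure rate), `n` (sample count), and `alpha` (significance level) are all assumptions made for illustration.

```python
import math
import random


def binomial_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def robustness_test(predict, x, label, eps, kappa=0.05, n=200, alpha=0.01, seed=0):
    """One-sided exact binomial test on the local failure rate (illustrative only).

    H0: the probability that a uniform perturbation in [-eps, eps] per feature
        changes the prediction is at least kappa.
    Rejecting H0 supports the claim that the failure rate is below kappa.
    """
    rng = random.Random(seed)
    failures = 0
    for _ in range(n):
        perturbed = [xi + rng.uniform(-eps, eps) for xi in x]
        if predict(perturbed) != label:
            failures += 1
    # p-value: chance of seeing this few failures if the true rate were kappa.
    p_value = binomial_cdf(failures, n, kappa)
    return failures, p_value, p_value < alpha


def toy_classifier(v):
    """Hypothetical stand-in model: class 1 if the features sum to a positive value."""
    return int(sum(v) > 0)


if __name__ == "__main__":
    # Far from the decision boundary: no perturbation flips the label.
    print(robustness_test(toy_classifier, [1.0, 1.0], 1, eps=0.1))
    # Near the decision boundary: many perturbations flip the label.
    print(robustness_test(toy_classifier, [0.01, 0.0], 1, eps=0.1))
```

In this sketch the cost is fixed at `n` forward passes per input, and the conclusion is a statistical statement at level `alpha` rather than a formal guarantee; a verification-based method would trade higher cost for a stronger claim.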
