Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers

Jonas Ngnawé; Sabyasachi Sahoo; Yann Pequignot; Frédéric Precioso; Christian Gagné

arXiv:2406.18451 · cs.LG · November 4, 2024

TL;DR

This paper introduces margin consistency as a way to efficiently detect vulnerable, non-robust samples in deep classifiers by linking input space margins with logit margins, enabling real-time vulnerability assessment.

Contribution

It establishes the theoretical link between input space and logit margins, and demonstrates how to use this for efficient detection of brittle decisions in robust models.
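As a rough illustration of the detection idea, the sketch below computes a logit margin per sample and flags samples whose margin falls under a threshold. It assumes the logit margin is the gap between the two largest logits; the function names `logit_margin` and `flag_brittle`, the example logits, and the threshold value are all hypothetical and are not taken from the authors' repository. In practice the threshold would presumably be calibrated on held-out data.

```python
def logit_margin(logits):
    """Gap between the largest and second-largest logit of one sample."""
    if len(logits) < 2:
        raise ValueError("need at least two class logits")
    top, runner_up = sorted(logits, reverse=True)[:2]
    return top - runner_up


def flag_brittle(batch_logits, threshold):
    """Mark a sample as potentially brittle when its logit margin is small.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    return [logit_margin(logits) < threshold for logits in batch_logits]


# Made-up logits for three samples of a 3-class classifier.
batch = [
    [4.0, 0.5, -1.0],   # confident decision, large margin
    [2.1, 2.0, 0.3],    # top two classes nearly tied
    [-0.2, 3.0, 2.5],   # moderately small margin
]

margins = [logit_margin(logits) for logits in batch]
flags = flag_brittle(batch, threshold=1.0)
```

The score needs only the logits from a single forward pass, which is what makes it cheap compared with running an adversarial attack on every sample.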

Findings

1. High correlation between input space margins and logit margins in robust models
2. Logit margin can effectively detect non-robust, brittle decisions
3. Pseudo-margin learning improves detection when margin consistency is low
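
One way to quantify how strongly the two margins agree is a rank correlation between them. The sketch below uses Kendall's tau as that measure, which is an assumption made here for illustration since the summary above only says "high correlation". The helper `kendall_tau` is a plain pairwise implementation (tau-a, no tie correction), and all margin values are invented; in a real evaluation the input space margins would have to be estimated, for example with an adversarial attack.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with >= 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical per-sample margins (illustrative numbers only).
input_margins = [0.02, 0.10, 0.05, 0.30, 0.18]

# Logit margins that rank the samples exactly like the input margins.
consistent_logit_margins = [0.4, 1.9, 1.1, 6.2, 3.0]
tau_consistent = kendall_tau(input_margins, consistent_logit_margins)

# Same values with one pair of samples ranked in the opposite order.
swapped_logit_margins = [0.4, 1.1, 1.9, 6.2, 3.0]
tau_swapped = kendall_tau(input_margins, swapped_logit_margins)
```

A tau near 1 means the logit margin orders samples almost the same way the input space margin does, so thresholding the cheap score selects nearly the same samples as thresholding the expensive one.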

Abstract

Despite extensive research on adversarial training strategies to improve robustness, the decisions of even the most robust deep learning models can still be quite sensitive to imperceptible perturbations, creating serious risks when deploying them for high-stakes real-world applications. While detecting such cases may be critical, evaluating a model's vulnerability at a per-instance level using adversarial attacks is computationally too intensive and unsuitable for real-time deployment scenarios. The input space margin is the exact score to detect non-robust samples and is intractable for deep neural networks. This paper introduces the concept of margin consistency -- a property that links the input space margins and the logit margins in robust models -- for efficient detection of vulnerable samples. First, we establish that margin consistency is a necessary and sufficient condition to…

Code & Models

Repositories

ngnawejonas/margin-consistency (PyTorch, official)

Videos

Detecting Brittle Decisions for Free: Leveraging Margin Consistency in Deep Robust Classifiers · SlidesLive

Taxonomy

Methods: Sparse Evolutionary Training