Eliminating Catastrophic Overfitting Via Abnormal Adversarial Examples Regularization

Runqi Lin; Chaojian Yu; Tongliang Liu

arXiv:2404.08154·cs.LG·September 17, 2024·5 cites

TL;DR

This paper identifies abnormal adversarial examples as a key factor in catastrophic overfitting during single-step adversarial training and proposes a regularization method to prevent their formation, thereby enhancing robustness.

Contribution

The paper introduces Abnormal Adversarial Examples Regularization (AAER), a regularization technique that explicitly controls abnormal adversarial examples to prevent catastrophic overfitting in single-step adversarial training (SSAT).

Findings

01 AAER effectively eliminates catastrophic overfitting.

02 The method improves adversarial robustness with minimal computational cost.

03 Experiments validate the correlation between AAEs and classifier distortion.

Abstract

Single-step adversarial training (SSAT) has demonstrated the potential to achieve both efficiency and robustness. However, SSAT suffers from catastrophic overfitting (CO), a phenomenon that leads to a severely distorted classifier and leaves it vulnerable to multi-step adversarial attacks. In this work, we observe that some adversarial examples generated on the SSAT-trained network behave anomalously: although these training samples are generated by the inner maximization process, their associated loss decreases instead. We name these abnormal adversarial examples (AAEs). Upon further analysis, we discover a close relationship between AAEs and classifier distortion, as both the number and the outputs of AAEs change significantly with the onset of CO. Given this observation, we re-examine the SSAT process and uncover that before the occurrence of CO, the classifier…
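The definition of an AAE in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a hypothetical one-dimensional loss, `1 + sin(w * x)`, standing in for a classifier's loss surface, and the helper names `fgsm_step`, `find_abnormal`, and `make_toy_loss` are invented for illustration. The only idea taken from the abstract is the check itself: take one gradient-sign step (the single-step inner maximization) and flag the example as abnormal if its loss went down rather than up.

```python
import math

def fgsm_step(x, grad, eps):
    """Single-step perturbation: move eps along the sign of the gradient."""
    sign = (grad > 0) - (grad < 0)
    return x + eps * sign

def find_abnormal(xs, loss_fn, grad_fn, eps):
    """Return one boolean per input: True where the one-step attack lowered the loss."""
    flags = []
    for x in xs:
        x_adv = fgsm_step(x, grad_fn(x), eps)
        flags.append(loss_fn(x_adv) < loss_fn(x))
    return flags

def make_toy_loss(w):
    # Hypothetical 1-D loss (an assumption, not from the paper).
    # A larger w gives a more sharply curved loss surface.
    loss = lambda x: 1.0 + math.sin(w * x)
    grad = lambda x: w * math.cos(w * x)
    return loss, grad

xs = [0.0, 0.1, 0.2]   # toy "training samples"
eps = 0.5              # perturbation budget

smooth_loss, smooth_grad = make_toy_loss(w=1.0)
sharp_loss, sharp_grad = make_toy_loss(w=8.0)

smooth_flags = find_abnormal(xs, smooth_loss, smooth_grad, eps)
sharp_flags = find_abnormal(xs, sharp_loss, sharp_grad, eps)

print("abnormal on smooth loss:", sum(smooth_flags))
print("abnormal on sharp loss:", sum(sharp_flags))
```

On the smooth toy loss the gradient-sign step raises the loss for every sample, so none is flagged. On the sharply curved one the same step overshoots the local rise and lands at a lower loss, so all three samples are flagged. This shows only why a first-order step can backfire on a highly nonlinear surface; it says nothing about how AAER itself is computed.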

Code & Models

Repositories

tmllab/2023_neurips_aaer (PyTorch, official)

Videos

Eliminating Catastrophic Overfitting Via Abnormal Adversarial Examples Regularization · SlidesLive

Taxonomy

Topics: Industrial Vision Systems and Defect Detection