Catastrophic Overfitting, Entropy Gap and Participation Ratio: A Noiseless $l^p$ Norm Solution for Fast Adversarial Training

Fares B. Mehouachi; Saif Eddin Jabari

arXiv:2505.02360 · cs.LG · May 19, 2026

TL;DR

This paper introduces a norm-based approach to prevent catastrophic overfitting in fast adversarial training by adaptively controlling the $l^p$ training norm, leading to improved robustness without extra regularization.

Contribution

It proposes a novel $l^p$ norm control framework and adaptive $l^p$-FGSM attacks to mitigate catastrophic overfitting in adversarial training.

Findings

01  Adaptive $l^p$-FGSM improves robustness against multi-step attacks.

02  Gradient concentration metrics predict and prevent overfitting.

03  The method achieves robustness without additional regularization or noise.
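As a rough illustration of what a gradient concentration metric can look like, here is a minimal sketch of a participation ratio computed from a flattened input gradient. The formula used, $(\|g\|_1 / \|g\|_2)^2$, is one common definition and is an assumption here; the paper's exact metric may differ. The function name `participation_ratio` is hypothetical.

```python
def participation_ratio(grad):
    """Return (||g||_1 / ||g||_2)^2 for a flat list of gradient entries.

    Assumed definition, not necessarily the paper's. The value is 1 when a
    single entry carries the whole gradient and len(grad) when all entries
    have equal magnitude, so low values indicate concentrated gradients.
    """
    l1 = sum(abs(g) for g in grad)
    l2_squared = sum(g * g for g in grad)
    if l2_squared == 0.0:
        raise ValueError("participation ratio is undefined for a zero gradient")
    return (l1 * l1) / l2_squared


if __name__ == "__main__":
    concentrated = [1.0, 0.0, 0.0, 0.0]   # all information in one dimension
    spread = [1.0, -1.0, 1.0, -1.0]       # information spread evenly
    print(participation_ratio(concentrated))
    print(participation_ratio(spread))
```

Under this assumed definition, a value close to 1 would flag the concentrated-gradient regime that the findings associate with catastrophic overfitting.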

Abstract

Adversarial training is a cornerstone of robust deep learning, but fast methods like the Fast Gradient Sign Method (FGSM) often suffer from Catastrophic Overfitting (CO), where models become robust to single-step attacks but fail against multi-step variants. While existing solutions rely on noise injection, regularization, or gradient clipping, we propose a novel solution that purely controls the $l^{p}$ training norm to mitigate CO. Our study is motivated by the empirical observation that CO is more prevalent under the $l^{\infty}$ norm than under the $l^{2}$ norm. Leveraging this insight, we formulate the generalized $l^{p}$ attack as a fixed-point problem and craft $l^{p}$-FGSM attacks to understand the transition mechanics from $l^{2}$ to $l^{\infty}$. This leads to our core insight: CO emerges when highly concentrated gradients, where information localizes in few dimensions, interact with…
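To make the transition from $l^{2}$ to $l^{\infty}$ concrete, the following is a minimal sketch of a single-step perturbation under an $l^{p}$ budget. It uses the standard closed-form maximizer of $\langle g, \delta \rangle$ subject to $\|\delta\|_p \le \epsilon$ given by Hölder duality, with $1/p + 1/q = 1$; it is not the paper's fixed-point formulation or its adaptive norm selection, which the truncated abstract does not spell out. The name `lp_fgsm_step` and its parameters are hypothetical.

```python
import math


def lp_fgsm_step(grad, eps, p):
    """Perturbation maximizing <grad, delta> subject to ||delta||_p <= eps.

    Closed form from Hoelder duality (an illustrative stand-in, not the
    paper's method): delta_i = eps * sign(g_i) * (|g_i| / ||g||_q)^(q - 1),
    where q = p / (p - 1). Supports p > 1 and p = math.inf.
    """
    if p == math.inf:
        # q = 1: the classic FGSM sign step.
        return [eps * math.copysign(1.0, g) if g != 0 else 0.0 for g in grad]
    if p <= 1:
        raise ValueError("this sketch only handles p > 1")
    q = p / (p - 1.0)
    norm_q = sum(abs(g) ** q for g in grad) ** (1.0 / q)
    if norm_q == 0.0:
        return [0.0 for _ in grad]
    # Normalize before exponentiating to keep intermediate values bounded.
    return [eps * math.copysign((abs(g) / norm_q) ** (q - 1.0), g) for g in grad]


if __name__ == "__main__":
    g = [3.0, -4.0]
    print(lp_fgsm_step(g, eps=1.0, p=2))         # normalized gradient direction
    print(lp_fgsm_step(g, eps=1.0, p=math.inf))  # sign of the gradient
```

For $p = 2$ the step is the normalized gradient, and as $p$ grows the exponent $q - 1 = 1/(p - 1)$ shrinks toward 0, so the step approaches the sign vector used by standard FGSM.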

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Industrial Vision Systems and Defect Detection · Integrated Circuits and Semiconductor Failure Analysis