Local Linearity and Double Descent in Catastrophic Overfitting

Varun Sivashankar; Nikil Selvam

arXiv:2111.10754·cs.LG·November 23, 2021

TL;DR

This paper investigates the role of local linearity and orthogonality in preventing catastrophic overfitting during adversarial training, and uncovers the double descent phenomenon in this context.

Contribution

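Using a simple neural network architecture, the paper shows experimentally that maintaining high local linearity might be sufficient, but is not necessary, to prevent catastrophic overfitting. It also introduces a regularization term that pushes the network's weight matrices toward orthogonality, and it identifies double descent in adversarial training.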
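
As a rough illustration of what a weight-orthogonality regularizer can look like, the sketch below computes the soft penalty ||W^T W - I||_F^2 for a single weight matrix. The exact form of the paper's regularizer is not given in this summary, so the penalty, the helper name `orthogonality_penalty`, and the example matrices are all assumptions made for illustration.

```python
def orthogonality_penalty(W):
    """Squared Frobenius norm of (W^T W - I) for a matrix given as a list of rows.

    Hypothetical helper, not taken from the paper: the value is zero exactly
    when the columns of W are orthonormal, and grows as they drift from that.
    """
    rows, cols = len(W), len(W[0])
    penalty = 0.0
    for i in range(cols):
        for j in range(cols):
            # Entry (i, j) of the Gram matrix W^T W.
            gram_ij = sum(W[k][i] * W[k][j] for k in range(rows))
            target = 1.0 if i == j else 0.0
            penalty += (gram_ij - target) ** 2
    return penalty


# Illustrative matrices (assumed): one orthogonal, one not.
identity = [[1.0, 0.0], [0.0, 1.0]]
skewed = [[1.0, 1.0], [0.0, 1.0]]

print(orthogonality_penalty(identity))
print(orthogonality_penalty(skewed))
```

In a training loop, a penalty of this kind would typically be summed over the layers and added to the adversarial loss with a weighting coefficient; that coefficient is likewise an assumption here.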

Findings

01 High local linearity can prevent catastrophic overfitting.

02 Orthogonal weight regularization relates to local linearity.

03 Double descent occurs during adversarial training.

Abstract

Catastrophic overfitting is a phenomenon observed during Adversarial Training (AT) with the Fast Gradient Sign Method (FGSM) where the test robustness steeply declines over just one epoch in the training stage. Prior work has attributed this loss in robustness to a sharp decrease in local linearity of the neural network with respect to the input space, and has demonstrated that introducing a local linearity measure as a regularization term prevents catastrophic overfitting. Using a simple neural network architecture, we experimentally demonstrate that maintaining high local linearity might be sufficient to prevent catastrophic overfitting but is not necessary. Further, inspired by Parseval networks, we introduce a regularization term to AT with FGSM to make the weight matrices of the network orthogonal and study the connection between orthogonality of…
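
To make the two central quantities concrete, here is a minimal sketch on a toy logistic-regression model, where the input gradient has a closed form. It shows the single-step FGSM perturbation (epsilon times the sign of the input gradient of the loss) and one possible way to quantify local linearity, namely the error of a first-order Taylor approximation of the loss around the input. Which local linearity measure the paper or the prior work uses is not stated in this summary, so that choice, the model, and every name and number below are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def loss(w, b, x, y):
    """Binary cross-entropy of a logistic model for a label y in {0, 1}."""
    p = sigmoid(logit(w, b, x))
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def input_gradient(w, b, x, y):
    """Gradient of the loss with respect to the input x: (p - y) * w."""
    p = sigmoid(logit(w, b, x))
    return [(p - y) * wi for wi in w]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm_perturbation(w, b, x, y, eps):
    """Single-step FGSM: eps * sign of the input gradient of the loss."""
    return [eps * sign(g) for g in input_gradient(w, b, x, y)]


def linearization_error(w, b, x, y, delta):
    """Gap between the true loss at x + delta and its first-order estimate.

    Assumed stand-in for a local linearity measure: a small gap means the
    loss behaves almost linearly along delta.
    """
    grad = input_gradient(w, b, x, y)
    x_shifted = [xi + di for xi, di in zip(x, delta)]
    first_order = loss(w, b, x, y) + sum(di * gi for di, gi in zip(delta, grad))
    return abs(loss(w, b, x_shifted, y) - first_order)


# Toy parameters and input (illustrative values only).
w, b = [2.0, -1.0], 0.0
x, y = [0.5, 0.5], 1

delta = fgsm_perturbation(w, b, x, y, eps=0.1)
x_adv = [xi + di for xi, di in zip(x, delta)]

print("perturbation:", delta)
print("clean loss:", loss(w, b, x, y))
print("adversarial loss:", loss(w, b, x_adv, y))
print("linearization error:", linearization_error(w, b, x, y, delta))
```

For a deep network the input gradient would come from automatic differentiation rather than a closed form, and a regularizer built on this idea would penalize the linearization error during training.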

Code & Models

Repositories

nikilrselvam/linearity-orthogonality-dd
PyTorch, official