Prevalence of Neural Collapse during the terminal phase of deep learning training

Vardan Papyan; X.Y. Han; David L. Donoho

arXiv:2008.08186·cs.LG·September 23, 2020

TL;DR

This paper investigates Neural Collapse, a phenomenon that emerges during the terminal phase of deep neural network training, in which last-layer activations and classifiers converge to a simple, symmetric geometric structure. The authors associate this structure with better generalization, robustness, and interpretability.

Contribution

It provides direct empirical measurements of Neural Collapse across three prototypical deepnet architectures and seven canonical classification datasets, and describes its geometric properties and benefits.

Findings

01. Neural Collapse occurs consistently during the terminal phase of training.

02. Class-mean activations and last-layer classifiers converge to a Simplex ETF geometry.

03. Neural Collapse is associated with improved model generalization and robustness.
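
For reference, the Simplex ETF named in the second finding has a standard closed form. The block below is a sketch of that definition; the symbols are introduced here for illustration and are not taken from this page: C is the number of classes, p >= C is the feature dimension, and U is any p x C matrix with orthonormal columns. The columns m_c of M are the ETF vertices.

```latex
M = \sqrt{\frac{C}{C-1}}\; U \left( I_C - \frac{1}{C}\,\mathbf{1}_C \mathbf{1}_C^{\top} \right),
\qquad U^{\top} U = I_C

M^{\top} M = \frac{C}{C-1}\left( I_C - \frac{1}{C}\,\mathbf{1}_C \mathbf{1}_C^{\top} \right)
\;\Longrightarrow\;
\lVert m_c \rVert_2 = 1,
\qquad
\langle m_c, m_{c'} \rangle = -\frac{1}{C-1} \quad (c \neq c')
```

In words: all vertices have equal norm, and every pair of distinct vertices meets at the same angle, the largest angle that C equiangular vectors can share.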

Abstract

Modern practice for training classification deepnets involves a Terminal Phase of Training (TPT), which begins at the epoch where training error first vanishes. During TPT, the training error stays effectively zero while training loss is pushed towards zero. Direct measurements of TPT, for three prototypical deepnet architectures and across seven canonical classification datasets, expose a pervasive inductive bias we call Neural Collapse, involving four deeply interconnected phenomena: (NC1) Cross-example within-class variability of last-layer training activations collapses to zero, as the individual activations themselves collapse to their class-means; (NC2) The class-means collapse to the vertices of a Simplex Equiangular Tight Frame (ETF); (NC3) Up to rescaling, the last-layer classifiers collapse to the class-means, or in other words to the Simplex ETF, i.e. to a self-dual…
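
The NC1 to NC3 conditions in the abstract can be turned into numerical diagnostics. The following is a minimal NumPy sketch, not the authors' code. The function names (`make_collapsed_features`, `nc_metrics`), the dictionary keys, and the synthetic data are illustrative assumptions, and the formulas chosen (a trace ratio for NC1, a coefficient of variation and a cosine deviation for NC2, a normalized Frobenius distance for NC3) are one plausible way to quantify the stated conditions. Each metric should approach zero as collapse sets in.

```python
import numpy as np


def make_collapsed_features(C=4, p=8, n=5, noise=0.0, seed=0):
    """Synthetic (hypothetical) features whose class means form a Simplex ETF."""
    rng = np.random.default_rng(seed)
    # U has shape (p, C) with orthonormal columns (reduced QR, requires p >= C).
    U, _ = np.linalg.qr(rng.standard_normal((p, C)))
    # Rows of `etf` are the C unit-norm ETF vertices in R^p.
    etf = np.sqrt(C / (C - 1)) * (np.eye(C) - np.ones((C, C)) / C) @ U.T
    y = np.repeat(np.arange(C), n)                        # balanced labels
    H = etf[y] + noise * rng.standard_normal((C * n, p))  # (N, p) features
    W = 3.0 * etf                                         # classifier: rescaled class means
    return H, y, W


def nc_metrics(H, y, W):
    """Collapse diagnostics for features H (N, p), labels y (N,), classifier W (C, p)."""
    N, p = H.shape
    C = W.shape[0]
    mu = np.stack([H[y == c].mean(axis=0) for c in range(C)])  # class means (C, p)
    M = mu - H.mean(axis=0)                                    # globally centered class means

    # NC1: within-class covariance measured relative to between-class covariance.
    Sigma_W = np.zeros((p, p))
    for c in range(C):
        D = H[y == c] - mu[c]
        Sigma_W += D.T @ D
    Sigma_W /= N
    Sigma_B = M.T @ M / C
    nc1 = np.trace(Sigma_W @ np.linalg.pinv(Sigma_B)) / C

    # NC2: equal norms and equal pairwise angles of the centered class means.
    norms = np.linalg.norm(M, axis=1)
    equinorm = norms.std() / norms.mean()
    unit = M / norms[:, None]
    cos = unit @ unit.T
    off_diag = cos[~np.eye(C, dtype=bool)]
    equiangular = np.abs(off_diag + 1.0 / (C - 1)).mean()

    # NC3: classifier and class means agree after Frobenius normalization.
    nc3 = np.linalg.norm(W / np.linalg.norm(W) - M / np.linalg.norm(M)) ** 2

    return {
        "nc1_within_class": nc1,
        "nc2_equinorm": equinorm,
        "nc2_equiangular": equiangular,
        "nc3_self_duality": nc3,
    }


# Compare exactly collapsed features with a noisy, less collapsed version.
for noise in (0.0, 0.5):
    H, y, W = make_collapsed_features(noise=noise)
    print(noise, nc_metrics(H, y, W))
```

In practice `H` would hold last-layer training activations and `W` the final linear layer's weights, evaluated at successive epochs. The pseudoinverse is used because the between-class covariance has rank at most C - 1.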

Code & Models

Repositories

neuralcollapse/neuralcollapse/blob/main/neuralcollapse.ipynb
none · Official
