TL;DR
This paper introduces the Layer-Peeled Model for analyzing deep neural networks trained for a sufficiently long time. The model explains neural collapse on class-balanced data and reveals a new phenomenon, Minority Collapse, in imbalanced training, with insights into mitigating it.
Contribution
The paper proposes the Layer-Peeled Model, a nonconvex yet analytically tractable optimization program, as a tool for understanding well-trained neural networks, and uncovers the novel Minority Collapse phenomenon in imbalanced datasets.
Findings
On class-balanced datasets, any solution to the Layer-Peeled Model forms a simplex equiangular tight frame, which in part explains neural collapse.
Minority Collapse limits performance on minority classes in imbalanced training.
The Layer-Peeled Model predicts phenomena that experiments later confirm.
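To make the simplex equiangular tight frame in the findings above concrete, here is a minimal sketch that builds one for K classes and checks its two defining properties: all vectors have the same norm, and every pair has the same cosine, -1/(K-1). The construction (scaled identity minus the all-ones average) is a standard textbook one, not code from the paper; the function names and the choice K = 4 are illustrative assumptions.

```python
import math

def simplex_etf(K):
    """Return K unit vectors in R^K forming a simplex equiangular tight frame.

    Illustrative construction: row i is sqrt(K/(K-1)) * (e_i - (1/K) * ones).
    """
    scale = math.sqrt(K / (K - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / K) for j in range(K)]
        for i in range(K)
    ]

def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

K = 4  # hypothetical number of classes
M = simplex_etf(K)

# Equal norms: every vector has unit length.
norms = [math.sqrt(dot(v, v)) for v in M]

# Equal angles: vectors are unit length, so the inner product is the cosine.
cosines = [dot(M[i], M[j]) for i in range(K) for j in range(K) if i != j]

print("norms:", [round(n, 6) for n in norms])
print("pairwise cosines:", sorted({round(c, 6) for c in cosines}))
print("expected cosine:", round(-1.0 / (K - 1), 6))
```

The pairwise cosine -1/(K-1) is the most negative value that K equal-norm vectors can share, so the vectors are as spread out as possible.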
Abstract
In this paper, we introduce the Layer-Peeled Model, a nonconvex yet analytically tractable optimization program, in a quest to better understand deep neural networks that are trained for a sufficiently long time. As the name suggests, this new model is derived by isolating the topmost layer from the remainder of the neural network, followed by imposing certain constraints separately on the two parts of the network. We demonstrate that the Layer-Peeled Model, albeit simple, inherits many characteristics of well-trained neural networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep learning training. First, when working on class-balanced datasets, we prove that any solution to this model forms a simplex equiangular tight frame, which in part explains the recently discovered phenomenon of neural collapse…
