Beyond Unconstrained Features: Neural Collapse for Shallow Neural Networks with General Data
Wanli Hong, Shuyang Ling

TL;DR
This paper investigates neural collapse in shallow ReLU neural networks, revealing how data properties and network architecture influence the phenomenon and its relation to generalization, extending theoretical understanding beyond unconstrained feature models.
Contribution
It provides a complete characterization of when neural collapse occurs in shallow networks, highlighting the roles of data dimension, sample size, signal-to-noise ratio (SNR), and network width.
Findings
Neural collapse depends on data dimension, sample size, and SNR, not just network width.
In two-layer networks, the sufficient conditions for NC depend on properties of the data (dimension, sample size, and SNR).
In three-layer networks, NC occurs if the first layer is sufficiently wide.
Abstract
Neural collapse (NC) is a phenomenon that emerges at the terminal phase of training (TPT) of deep neural networks (DNNs). The features of the data in the same class collapse to their respective sample means, and the sample means form a simplex equiangular tight frame (ETF). In the past few years, there has been a surge of work on explaining why NC occurs and how it affects generalization. Since DNNs are notoriously difficult to analyze, most works focus on the unconstrained feature model (UFM). While the UFM explains NC to some extent, it fails to provide a complete picture of how the network architecture and the dataset affect NC. In this work, we focus on shallow ReLU neural networks and try to understand how the width, depth, data dimension, and statistical properties of the training dataset influence neural collapse. We provide a complete…
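To make the two defining properties in the abstract concrete, the following is a minimal NumPy sketch, not taken from the paper. It builds a simplex ETF for K classes and measures (a) how tightly features cluster around their class means and (b) how far the pairwise angles between centered class means are from the ETF value of -1/(K-1). The function names (`simplex_etf`, `within_class_variability`, `etf_angle_deviation`) and the synthetic features are illustrative assumptions; the paper may use different metrics.

```python
import numpy as np

def simplex_etf(K):
    """K x K symmetric matrix whose rows are unit vectors with pairwise inner product -1/(K-1)."""
    return np.sqrt(K / (K - 1)) * (np.eye(K) - np.ones((K, K)) / K)

def within_class_variability(features, labels):
    """Within-class scatter divided by between-class scatter (0 means full collapse to class means)."""
    global_mean = features.mean(axis=0)
    within, between = 0.0, 0.0
    for c in np.unique(labels):
        fc = features[labels == c]
        mu = fc.mean(axis=0)
        within += ((fc - mu) ** 2).sum()
        between += len(fc) * ((mu - global_mean) ** 2).sum()
    return within / between

def etf_angle_deviation(features, labels):
    """Largest gap between a pairwise cosine of centered class means and -1/(K-1).

    Only the equiangular part of the ETF property is checked; equal norms are not.
    """
    classes = np.unique(labels)
    K = len(classes)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered = means - means.mean(axis=0)
    centered = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    cos = centered @ centered.T
    off_diag = cos[~np.eye(K, dtype=bool)]
    return np.abs(off_diag + 1.0 / (K - 1)).max()

# Synthetic "features" (an assumption for illustration): ETF vertices plus small Gaussian noise.
K, n_per_class = 4, 50
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(K), n_per_class)
clean = simplex_etf(K)[labels]                      # perfectly collapsed features
noisy = clean + 0.05 * rng.standard_normal(clean.shape)

print("collapsed:", within_class_variability(clean, labels), etf_angle_deviation(clean, labels))
print("noisy:    ", within_class_variability(noisy, labels), etf_angle_deviation(noisy, labels))
```

For perfectly collapsed features both quantities are zero up to floating-point error. Adding noise makes the within-class ratio strictly positive, which is the kind of departure from NC that the paper relates to data properties such as the SNR.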
Taxonomy
Topics: Neural Networks and Applications
