Neural Collapse in Deep Linear Networks: From Balanced to Imbalanced Data
Hien Dang, Tho Tran, Stanley Osher, Hung Tran-The, Nhat Ho, and Tan Nguyen

TL;DR
This paper proves that neural collapse, a phenomenon in which last-layer features and classifiers form specific geometric structures, occurs in deep linear networks for both balanced and imbalanced data, and validates the theory empirically.
Contribution
It extends the theory of neural collapse from the unconstrained feature model to deep linear networks trained with MSE and CE losses, covers imbalanced data, and gives a geometric analysis of the resulting structure.
Findings
Neural collapse occurs in deep linear networks at global minima.
For imbalanced data, last-layer features and classifiers converge to orthogonal vectors whose lengths depend on the amount of data in each class.
Experiments on synthetic and practical network architectures confirm the theoretical predictions.
Abstract
Modern deep neural networks have achieved impressive performance on tasks from image classification to natural language processing. Surprisingly, these complex systems with massive amounts of parameters exhibit the same structural properties in their last-layer features and classifiers across canonical datasets when training until convergence. In particular, it has been observed that the last-layer features collapse to their class-means, and those class-means are the vertices of a simplex Equiangular Tight Frame (ETF). This phenomenon is known as Neural Collapse (NC). Recent papers have theoretically shown that NC emerges in the global minimizers of training problems with the simplified "unconstrained feature model". In this context, we take a step further and prove the NC occurrences in deep linear networks for the popular mean squared error (MSE) and cross entropy (CE) losses, showing…
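The two properties named in the abstract (features collapsing to their class-means, and class-means forming a simplex ETF) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the standard construction sqrt(K/(K-1)) * (I - (1/K) 1 1^T) for a simplex ETF with K classes, and the helper names `simplex_etf` and `within_class_scatter` are hypothetical, introduced only for illustration.

```python
import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a K x K matrix whose columns are simplex ETF vertices.

    Assumed construction: sqrt(K / (K - 1)) * (I - (1/K) * ones * ones^T).
    """
    k = num_classes
    return np.sqrt(k / (k - 1)) * (np.eye(k) - np.ones((k, k)) / k)


def within_class_scatter(features: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared distance of each feature to its class mean.

    This is zero when every feature equals its class mean (full collapse).
    """
    total = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        total += np.sum((class_feats - class_feats.mean(axis=0)) ** 2)
    return total / len(features)


K = 4
M = simplex_etf(K)

# Gram matrix of the vertices: diagonal holds squared norms,
# off-diagonal holds pairwise inner products.
gram = M.T @ M
norms = np.sqrt(np.diag(gram))
cosines = gram / np.outer(norms, norms)
off_diag = cosines[~np.eye(K, dtype=bool)]

# Idealized collapsed features: 5 samples per class, each placed
# exactly at its class vertex (an illustrative toy dataset).
labels = np.repeat(np.arange(K), 5)
features = M.T[labels]

print("vertex norms:", norms)
print("pairwise cosines:", off_diag)
print("within-class scatter:", within_class_scatter(features, labels))
```

In this idealized setting the vertices have equal norm, every pair meets at the same angle with cosine -1/(K-1), and the within-class scatter vanishes. On features from a trained network, the class-means would first be centered by the global mean before the same checks apply.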
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Imbalanced Data Classification Techniques · Advanced Neural Network Applications
