The Persistence of Neural Collapse Despite Low-Rank Bias
Connall Garrod, Jonathan P. Keating

TL;DR
This paper investigates why neural collapse persists in deep networks despite theoretical predictions of its suboptimality, revealing a low-rank bias and the prevalence of deep neural collapse (DNC) in the loss landscape.
Contribution
It extends prior theoretical results to cross-entropy loss, characterizes the low-rank bias, and explains the frequent empirical occurrence of neural collapse.
Findings
High-rank structures, including DNC, are not generally optimal under cross-entropy loss.
The number of non-negligible singular values at global minima stays below a fixed bound as network depth increases.
DNC is more prevalent in the loss landscape than other critical configurations.
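The low-rank finding above is stated in terms of how many singular values are non-negligible. A minimal sketch of how such a count might be computed for a weight or feature matrix follows. The function name and the relative threshold `rel_tol` are illustrative assumptions, not the paper's definition of "non-negligible".

```python
import numpy as np

def count_significant_singular_values(matrix, rel_tol=1e-3):
    """Count singular values larger than rel_tol times the largest one.

    rel_tol is an assumed, illustrative cutoff; the paper may define
    "non-negligible" differently.
    """
    s = np.linalg.svd(matrix, compute_uv=False)  # sorted in descending order
    if s.size == 0 or s[0] == 0:
        return 0
    return int(np.sum(s > rel_tol * s[0]))

# Usage: a rank-2 matrix perturbed by tiny noise is full rank numerically,
# but only two of its singular values are significant.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((10, 2)) @ rng.standard_normal((2, 8))
noisy = low_rank + 1e-8 * rng.standard_normal((10, 8))
count = count_significant_singular_values(noisy)
```

A relative threshold is used here so that the count does not change when the matrix is rescaled.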
Abstract
Neural collapse (NC) and its multi-layer variant, deep neural collapse (DNC), describe a structured geometry that occurs in the features and weights of trained deep networks. Recent theoretical work by Sukenik et al. using a deep unconstrained feature model (UFM) suggests that DNC is suboptimal under mean squared error (MSE) loss. They heuristically argue that this is due to low-rank bias induced by L2 regularization. In this work, we extend this result to deep UFMs trained with cross-entropy loss, showing that high-rank structures, including DNC, are not generally optimal. We characterize the associated low-rank bias, proving a fixed bound on the number of non-negligible singular values at global minima as network depth increases. We further analyze the loss surface, demonstrating that DNC is more prevalent in the landscape than other critical configurations, which we argue explains…
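For readers unfamiliar with the setup, the following is a rough sketch of the kind of objective a deep UFM with cross-entropy loss and L2 regularization involves: the first-layer features are treated as free variables and optimized jointly with the layer weights. The function name, the ReLU nonlinearity, the absence of biases, and the single shared `weight_decay` coefficient are all assumptions made for illustration. The paper's exact formulation may differ.

```python
import numpy as np

def deep_ufm_ce_loss(H1, weights, labels, weight_decay=5e-4):
    """Cross-entropy plus L2 penalty for a hypothetical deep UFM.

    H1      : (d, N) array of free first-layer features, one column per sample
    weights : list of weight matrices; the last one maps to K class logits
    labels  : (N,) integer class labels in [0, K)
    """
    Z = H1
    for W in weights[:-1]:
        Z = np.maximum(W @ Z, 0.0)  # ReLU between layers (assumed)
    logits = weights[-1] @ Z  # shape (K, N)

    # Numerically stable log-softmax over the class axis.
    logits = logits - logits.max(axis=0, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=0, keepdims=True))
    ce = -log_probs[labels, np.arange(labels.size)].mean()

    # L2 penalty on the free features and on every weight matrix.
    sq_norms = np.sum(H1 ** 2) + sum(np.sum(W ** 2) for W in weights)
    return ce + 0.5 * weight_decay * sq_norms

# Usage with random parameters: K = 3 classes, width 5, 4 samples per class.
rng = np.random.default_rng(0)
K, d, n_per_class = 3, 5, 4
labels = np.repeat(np.arange(K), n_per_class)
H1 = rng.standard_normal((d, labels.size))
weights = [rng.standard_normal((d, d)),
           rng.standard_normal((d, d)),
           rng.standard_normal((K, d))]
loss = deep_ufm_ce_loss(H1, weights, labels)
```

With all parameters set to zero the logits are uniform, so the loss reduces to log K, which gives a quick sanity check for the implementation.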
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
