Inducing Neural Collapse in Imbalanced Learning: Do We Really Need a Learnable Classifier at the End of Deep Neural Network?
Yibo Yang, Shixiang Chen, Xiangtai Li, Liang Xie, Zhouchen Lin, Dacheng Tao

TL;DR
This paper investigates whether fixing a neural network's classifier as a simplex equiangular tight frame (ETF) can induce neural collapse and improve training efficiency on imbalanced datasets, questioning whether a learnable classifier is necessary.
Contribution
The study shows that a classifier fixed as a simplex ETF can induce neural collapse even on imbalanced data, which removes the need for a learnable classifier, and that the cross-entropy loss can be replaced by a squared loss.
Findings
Fixed ETF classifier induces neural collapse in imbalanced data
Replacing the cross-entropy loss with a squared loss improves convergence
Method achieves faster training and better performance on multiple datasets
Abstract
Modern deep neural networks for classification usually jointly learn a backbone for representation and a linear classifier to output the logit of each class. A recent study has shown a phenomenon called neural collapse that the within-class means of features and the classifier vectors converge to the vertices of a simplex equiangular tight frame (ETF) at the terminal phase of training on a balanced dataset. Since the ETF geometric structure maximally separates the pair-wise angles of all classes in the classifier, it is natural to raise the question, why do we spend an effort to learn a classifier when we know its optimal geometric structure? In this paper, we study the potential of learning a neural network for classification with the classifier randomly initialized as an ETF and fixed during training. Our analytical work based on the layer-peeled model indicates that the feature…
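The simplex ETF geometry mentioned in the abstract can be made concrete with a small sketch. It uses the standard construction M = sqrt(K/(K-1)) · U · (I_K − (1/K)·1·1ᵀ), where U has orthonormal columns. The function name `simplex_etf`, its parameters, and the restriction `feat_dim >= num_classes` are assumptions made for this illustration and are not taken from the authors' code.

```python
import numpy as np


def simplex_etf(num_classes: int, feat_dim: int, seed: int = 0) -> np.ndarray:
    """Return a (feat_dim, num_classes) matrix whose columns form a simplex ETF.

    Hypothetical helper for illustration; assumes feat_dim >= num_classes >= 2.
    """
    if num_classes < 2 or feat_dim < num_classes:
        raise ValueError("this sketch assumes feat_dim >= num_classes >= 2")
    rng = np.random.default_rng(seed)
    # Reduced QR of a random Gaussian matrix gives U with U^T U = I_K.
    u, _ = np.linalg.qr(rng.standard_normal((feat_dim, num_classes)))
    k = num_classes
    # Centering matrix I_K - (1/K) 1 1^T.
    centering = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * (u @ centering)


# Usage: build fixed classifier vectors for 10 classes in a 64-dim feature space.
W = simplex_etf(num_classes=10, feat_dim=64)
gram = W.T @ W  # unit diagonal, off-diagonal entries equal to -1/(K-1)
```

Every column has unit norm and every pair of distinct columns has cosine similarity −1/(K−1), the largest equal pairwise separation K vectors can have. In the setting the abstract describes, such a matrix would be generated once and kept fixed while only the backbone is trained.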
Taxonomy
Topics: Imbalanced Data Classification Techniques · Infrastructure Maintenance and Monitoring · Domain Adaptation and Few-Shot Learning
