Leveraging Intermediate Neural Collapse with Simplex ETFs for Efficient Deep Neural Networks

Emily Liu

arXiv:2412.00884·cs.LG·December 3, 2024

Open Access

TL;DR

This paper explores how enforcing neural collapse via simplex ETFs across all layers can make neural network training more efficient and reduce the number of trainable parameters without sacrificing accuracy.

Contribution

The paper introduces Adaptive-ETF and ETF-Transformer, two methods that apply simplex ETF constraints to all layers in order to improve training efficiency and reduce the parameter count.

Findings

01. Achieve comparable performance with fewer parameters.

02. Enforce neural collapse across multiple layers.

03. Reduce model complexity without accuracy loss.

Abstract

Neural collapse is a phenomenon observed during the terminal phase of neural network training, characterized by the convergence of network activations, class means, and linear classifier weights to a simplex equiangular tight frame (ETF), a configuration of vectors that maximizes mutual distance within a subspace. This phenomenon has been linked to improved interpretability, robustness, and generalization in neural networks. However, its potential to guide neural network training and regularization remains underexplored. Previous research has demonstrated that constraining the final layer of a neural network to a simplex ETF can reduce the number of trainable parameters without sacrificing model accuracy. Furthermore, deep fully connected networks exhibit neural collapse not only in the final layer but across all layers beyond a specific effective depth. Using these insights, we propose…
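To make the simplex ETF geometry mentioned in the abstract concrete, the following is a minimal sketch of the standard construction for K classes: the rows of sqrt(K/(K-1)) * (I - (1/K) * 1 1^T). It is not taken from the paper's code. The function names are hypothetical, and the sketch omits the partial orthogonal projection that is normally used to embed the frame in a feature space of dimension larger than K. It only shows the two defining properties: unit-norm vectors and a common pairwise cosine of -1/(K-1).

```python
import math


def simplex_etf(k):
    """Return k unit vectors in R^k forming a simplex ETF (one vector per row).

    Hypothetical helper for illustration; assumes feature dimension equals k.
    """
    if k < 2:
        raise ValueError("a simplex ETF needs at least 2 vectors")
    scale = math.sqrt(k / (k - 1))
    # Rows of scale * (I - (1/k) * ones): centered, rescaled one-hot vectors.
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine of the angle between two nonzero vectors."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))


num_classes = 4
etf = simplex_etf(num_classes)

# Every vector has unit norm.
norms = [math.sqrt(dot(v, v)) for v in etf]

# Every distinct pair has the same cosine, -1 / (k - 1).
pair_cosines = [
    cosine(etf[i], etf[j])
    for i in range(num_classes)
    for j in range(i + 1, num_classes)
]
```

With four classes, each pairwise cosine equals -1/3, the most negative value that K equiangular unit vectors can share. A layer whose weights are fixed to such a frame has no trainable weights of its own, which is the source of the parameter savings the abstract describes for the final layer.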

Taxonomy

Topics: Neural Networks and Applications