Average gradient outer product as a mechanism for deep neural collapse
Daniel Beaglehole, Peter Súkeník, Marco Mondelli, Mikhail Belkin

TL;DR
This paper gives a data-dependent explanation for Deep Neural Collapse (DNC): collapse forms through feature learning via the average gradient outer product (AGOP) of a learned predictor.
Contribution
The paper proposes the AGOP as a mechanism for DNC, introduces the Deep Recursive Feature Machine (Deep RFM) model, and provides both empirical and theoretical evidence linking the AGOP to neural collapse.
Findings
DNC occurs in Deep RFM due to AGOP-based projections
Theoretical explanation of DNC via kernel learning and asymptotic analysis
Singular vectors of weights relate to within-class variability collapse
Abstract
Deep Neural Collapse (DNC) refers to the surprisingly rigid structure of the data representations in the final layers of Deep Neural Networks (DNNs). Though the phenomenon has been measured in a variety of settings, its emergence is typically explained via data-agnostic approaches, such as the unconstrained features model. In this work, we introduce a data-dependent setting where DNC forms due to feature learning through the average gradient outer product (AGOP). The AGOP is defined with respect to a learned predictor and is equal to the uncentered covariance matrix of its input-output gradients averaged over the training dataset. The Deep Recursive Feature Machine (Deep RFM) is a method that constructs a neural network by iteratively mapping the data with the AGOP and applying an untrained random feature map. We demonstrate empirically that DNC occurs in Deep RFM across standard…
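The AGOP definition in the abstract (the uncentered covariance of input-output gradients, averaged over the training set) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names `jacobian_fd` and `agop`, the finite-difference Jacobian, and the toy linear predictor are all assumptions made for illustration. For a predictor f with Jacobian J(x), the sketch computes (1/n) Σ J(x_i)ᵀ J(x_i).

```python
import numpy as np

def jacobian_fd(f, x, eps=1e-6):
    """Central finite-difference Jacobian of f at x (hypothetical helper).

    Returns an array of shape (output_dim, input_dim).
    """
    x = np.asarray(x, dtype=float)
    out_dim = np.asarray(f(x), dtype=float).size
    J = np.empty((out_dim, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (np.asarray(f(x + step)) - np.asarray(f(x - step))) / (2 * eps)
    return J

def agop(f, X):
    """Average gradient outer product of predictor f over the rows of X.

    Computes (1/n) * sum_i J(x_i)^T J(x_i), a d x d matrix where d is the
    input dimension.
    """
    n, d = X.shape
    M = np.zeros((d, d))
    for x in X:
        J = jacobian_fd(f, x)
        M += J.T @ J
    return M / n

# Toy check (assumed setup): for a linear predictor f(x) = A x the Jacobian
# is A at every point, so the AGOP should match A^T A up to numerical error.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 5))
X = rng.standard_normal((20, 5))
M_linear = agop(lambda x: A @ x, X)
```

The linear case is chosen only because its AGOP has a closed form to compare against; for a trained predictor one would normally obtain the gradients by automatic differentiation rather than finite differences.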
