Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model
Connall Garrod, Jonathan P. Keating

TL;DR
This paper provides an analytic explanation for the low-dimensional structures observed in deep learning models' weights, Hessians, and gradients, linking them to deep neural collapse and layerwise properties, validated through theory and experiments.
Contribution
It introduces a unified analysis of low-dimensional phenomena in deep networks using deep unconstrained feature models, revealing the role of neural collapse and deriving explicit spectral properties.
Findings
Eigenvalues and eigenvectors expressed in terms of class means
Hessian inherits low-dimensional structure from layerwise Hessians
Empirical validation in UFMs and deep networks
Abstract
Empirical studies have revealed low-dimensional structures in the eigenspectra of weights, Hessians, gradients, and feature vectors of deep networks, consistently observed across datasets and architectures in the overparameterized regime. In this work, we analyze deep unconstrained feature models (UFMs) to provide an analytic explanation of how these structures emerge at the layerwise level, including the bulk-outlier Hessian spectrum and the alignment of gradient descent with the outlier eigenspace. We show that deep neural collapse underlies these phenomena, deriving explicit expressions for eigenvalues and eigenvectors of many deep learning matrices in terms of class feature means. Furthermore, we demonstrate that the full Hessian inherits its low-dimensional structure from the layerwise Hessians, and empirically validate our theory in both UFMs and deep networks.
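The link between collapsed features and a low-dimensional spectrum can be illustrated with a small numerical sketch. This is not the paper's derivation; it only checks a standard linear-algebra fact under an idealized assumption: if every sample's feature vector equals its class mean (complete within-class collapse), the feature Gram matrix equals n times the Gram matrix of the class means, so it has at most K nonzero eigenvalues and all of them are determined by the class means. The dimensions d, K, n, the random class means, and the helper names below are hypothetical choices made for illustration.

```python
import numpy as np

def collapsed_features(class_means, n_per_class):
    """Build a feature matrix whose columns are samples, each equal to its class mean."""
    # Each column of class_means is repeated n_per_class times in a row.
    return np.repeat(class_means, n_per_class, axis=1)

def top_eigenvalues(matrix, k):
    """Return the k largest eigenvalues of a symmetric matrix, in descending order."""
    eigvals = np.linalg.eigvalsh(matrix)  # ascending order
    return eigvals[::-1][:k]

# Hypothetical sizes: feature dimension d, number of classes K, samples per class n.
rng = np.random.default_rng(0)
d, K, n = 20, 4, 50

M = rng.standard_normal((d, K))      # class means (assumed random here)
H = collapsed_features(M, n)         # d x (K * n), fully collapsed features

gram = H @ H.T                       # d x d feature Gram matrix
mean_gram = n * (M @ M.T)            # the same matrix, written via class means

# Count eigenvalues that are non-negligible relative to the largest one.
all_eigvals = np.linalg.eigvalsh(gram)
num_outliers = int(np.sum(all_eigvals > 1e-8 * all_eigvals.max()))

outliers = top_eigenvalues(gram, K)
predicted = top_eigenvalues(mean_gram, K)

print("non-negligible eigenvalues:", num_outliers)
print("match class-mean prediction:", np.allclose(outliers, predicted))
```

In this toy setting the d-dimensional spectrum splits into K eigenvalues fixed by the class means and d - K eigenvalues that are numerically zero, which is the simplest instance of the outlier-versus-bulk picture the abstract describes for weights, gradients, and Hessians.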
Taxonomy
Topics: Image Processing and 3D Reconstruction
