Unifying Low Dimensional Observations in Deep Learning Through the Deep Linear Unconstrained Feature Model

Connall Garrod; Jonathan P. Keating

arXiv:2404.06106 · cs.LG · January 27, 2026 · 2 cites

Open Access

TL;DR

This paper gives an analytic explanation for the low-dimensional structures observed in the weights, Hessians, gradients, and feature vectors of deep networks. It shows that deep neural collapse underlies these structures at the layerwise level, and it validates the theory empirically in both UFMs and deep networks.

Contribution

It gives a unified analysis of low-dimensional phenomena in deep networks using deep unconstrained feature models (UFMs), identifies deep neural collapse as their common cause, and derives explicit expressions for the eigenvalues and eigenvectors of the matrices involved.

Findings

01. Eigenvalues and eigenvectors of many deep learning matrices are expressed explicitly in terms of class feature means.

02. The full Hessian inherits its low-dimensional structure from the layerwise Hessians.

03. The theory is validated empirically in both UFMs and deep networks.

Abstract

Empirical studies have revealed low dimensional structures in the eigenspectra of weights, Hessians, gradients, and feature vectors of deep networks, consistently observed across datasets and architectures in the overparameterized regime. In this work, we analyze deep unconstrained feature models (UFMs) to provide an analytic explanation of how these structures emerge at the layerwise level, including the bulk outlier Hessian spectrum and the alignment of gradient descent with the outlier eigenspace. We show that deep neural collapse underlies these phenomena, deriving explicit expressions for eigenvalues and eigenvectors of many deep learning matrices in terms of class feature means. Furthermore, we demonstrate that the full Hessian inherits its low dimensional structure from the layerwise Hessians, and empirically validate our theory in both UFMs and deep networks.
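
To make the link between collapsed features and a low-dimensional layerwise Hessian concrete, here is a minimal NumPy sketch. It is not the paper's derivation: it assumes a toy two-layer linear model with MSE loss (1/2N)·||A W H − Y||², a fixed second layer `A`, and features `H` set to their class means plus optional noise. Under those assumptions, and with column-major vectorization of `W`, the Hessian with respect to `W` is the Kronecker product of H Hᵀ/N and AᵀA. The function name `layer_hessian_rank` and all parameter values are hypothetical.

```python
import numpy as np

def layer_hessian_rank(noise=0.0, d=12, K=3, n_per_class=10, seed=0, tol=1e-9):
    """Count non-negligible eigenvalues of a first-layer Hessian in a toy setup."""
    rng = np.random.default_rng(seed)
    N = K * n_per_class
    Y = np.repeat(np.eye(K), n_per_class, axis=1)    # one-hot labels, shape (K, N)
    M = rng.standard_normal((d, K))                  # class feature means, shape (d, K)
    H = M @ Y + noise * rng.standard_normal((d, N))  # noise=0 means exactly collapsed features
    A = rng.standard_normal((K, d))                  # fixed second-layer weights (assumption)
    # Hessian of (1/2N) * ||A W H - Y||_F^2 with respect to column-major vec(W).
    hessian = np.kron(H @ H.T / N, A.T @ A)          # shape (d*d, d*d)
    eigvals = np.linalg.eigvalsh(hessian)
    # Treat eigenvalues below a relative tolerance as numerically zero.
    return int(np.sum(eigvals > tol * eigvals.max()))

collapsed_rank = layer_hessian_rank(noise=0.0)
noisy_rank = layer_hessian_rank(noise=0.5)
print(collapsed_rank, noisy_rank)
```

In this toy setting the collapsed case leaves at most K² of the d² eigenvalues non-negligible, because H Hᵀ/N reduces to a rank-K matrix built from the class means, while within-class noise can raise the count to as many as d·K.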

Taxonomy

Topics: Image Processing and 3D Reconstruction