Analytic Insights into Structure and Rank of Neural Network Hessian Maps
Sidak Pal Singh, Gregor Bachmann, Thomas Hofmann

TL;DR
This paper analyzes the Hessian of neural networks theoretically, characterizing its rank deficiency and the structural reasons behind it, with exact rank formulas and tight upper bounds for deep linear networks.
Contribution
We develop exact formulas and tight upper bounds for the Hessian rank of deep linear networks, and demonstrate that these bounds remain faithful estimates of the numerical Hessian rank for a larger class of models.
Findings
Exact formulas for Hessian rank in deep linear networks
Bounds on Hessian rank applicable to nonlinear networks
Insights into how architecture affects Hessian rank deficiency
Abstract
The Hessian of a neural network captures parameter interactions through second-order derivatives of the loss. It is a fundamental object of study, closely tied to various problems in deep learning, including model design, optimization, and generalization. Most prior work has been empirical, typically focusing on low-rank approximations and heuristics that are blind to the network structure. In contrast, we develop theoretical tools to analyze the range of the Hessian map, providing us with a precise understanding of its rank deficiency as well as the structural reasons behind it. This yields exact formulas and tight upper bounds for the Hessian rank of deep linear networks, allowing for an elegant interpretation in terms of rank deficiency. Moreover, we demonstrate that our bounds remain faithful as an estimate of the numerical Hessian rank, for a larger class of models such as…
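The rank deficiency described above can be observed on a toy problem. The following is a minimal sketch, not the paper's method or formulas: it builds the exact Hessian of a two-layer linear network under squared loss and computes its rank with exact rational arithmetic. The layer sizes, data, and parameter values are hypothetical choices made for illustration. The sketch relies on one stated assumption that holds for this particular model: the loss is at most quadratic in each individual parameter, so central differences with step 1 return exact second derivatives.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy sizes: 2 inputs, 4 hidden units, 1 output (12 parameters).
D_IN, HIDDEN, D_OUT = 2, 4, 1

# Hypothetical toy data set (inputs and targets).
XS = [(1, 2), (3, -1), (2, 5)]
YS = [(1,), (-2,), (4,)]

# Flat parameter vector: first W1 (HIDDEN x D_IN), then W2 (D_OUT x HIDDEN).
THETA = [Fraction(v) for v in (1, -2, 3, 1, -1, 2, 2, 3, 1, -1, 2, 1)]


def loss(theta):
    """Squared loss 1/2 * sum_n ||W2 W1 x_n - y_n||^2 for a flat parameter list."""
    w1 = [theta[j * D_IN:(j + 1) * D_IN] for j in range(HIDDEN)]
    off = HIDDEN * D_IN
    w2 = [theta[off + k * HIDDEN:off + (k + 1) * HIDDEN] for k in range(D_OUT)]
    total = Fraction(0)
    for x, y in zip(XS, YS):
        hidden = [sum(w * xi for w, xi in zip(row, x)) for row in w1]
        out = [sum(w * hj for w, hj in zip(row, hidden)) for row in w2]
        total += sum((o - t) ** 2 for o, t in zip(out, y))
    return total / 2


def hessian(theta):
    """Hessian via central differences with step 1 (exact for this loss)."""
    p = len(theta)

    def shifted(i, si, j=None, sj=0):
        t = list(theta)
        t[i] += si
        if j is not None:
            t[j] += sj
        return loss(t)

    base = loss(theta)
    H = [[Fraction(0)] * p for _ in range(p)]
    for i in range(p):
        H[i][i] = shifted(i, 1) - 2 * base + shifted(i, -1)
        for j in range(i + 1, p):
            mixed = sum(si * sj * shifted(i, si, j, sj)
                        for si, sj in product((1, -1), repeat=2))
            H[i][j] = H[j][i] = mixed / 4
    return H


def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            if factor:
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


H = hessian(THETA)
n_params = len(THETA)
hessian_rank = rank(H)
print(f"parameters: {n_params}, Hessian rank: {hessian_rank}")
```

Exact rationals are used here so that the rank is not sensitive to a floating-point tolerance. For larger or nonlinear models one would instead compute a numerical rank from the singular values, which is the quantity the abstract refers to as the numerical Hessian rank.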
Taxonomy
Topics: Neural Networks and Applications · Medical Image Segmentation Techniques · Face and Expression Recognition
