Investigating generalization capabilities of neural networks by means of loss landscapes and Hessian analysis
Nikita Gabdullin

TL;DR
This paper introduces an improved loss landscape analysis method using Hessian spectra to evaluate neural network generalization, demonstrating that spectral criteria correlate with accuracy across dataset shifts.
Contribution
The study presents novel visualization techniques and quantitative criteria based on Hessian analysis for assessing neural network generalization capabilities.
Findings
Hessian spectra exhibit consistent patterns across various neural networks.
Proposed spectral criteria correlate with model accuracy on different datasets.
Hessian axes can improve loss landscape visualization in networks with batch normalization.
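To make "Hessian spectrum" concrete, here is a minimal sketch on a toy problem. It is not the LLA library's API and not the paper's criteria. The two-parameter `loss` function and the helper names `hessian_fd` and `eig_sym_2x2` are hypothetical, chosen only so the whole Hessian can be built by finite differences and its eigenvalues read off in closed form.

```python
import math

def loss(w):
    # Hypothetical two-parameter toy loss standing in for a network's loss.
    x, y = w
    return 2.0 * x * x + 0.5 * y * y + x * y

def hessian_fd(f, w, eps=1e-3):
    # Central finite-difference estimate of the Hessian of f at w.
    n = len(w)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                v = list(w)
                v[i] += si * eps
                v[j] += sj * eps
                return f(v)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps * eps)
    return H

def eig_sym_2x2(H):
    # Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
    a, d = H[0][0], H[1][1]
    b = 0.5 * (H[0][1] + H[1][0])  # symmetrize away finite-difference noise
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius, mean - radius

H = hessian_fd(loss, [0.3, -0.2])
lam_max, lam_min = eig_sym_2x2(H)
print("Hessian:", H)
print("spectrum:", lam_max, lam_min)
```

For a real network the Hessian is far too large to form explicitly, so the spectrum is normally estimated from Hessian-vector products instead. The toy only shows what the eigenvalues describe: the curvature of the loss along each eigenvector direction.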
Abstract
This paper studies the generalization capabilities of neural networks (NNs) using a new and improved PyTorch library, Loss Landscape Analysis (LLA). LLA facilitates visualization and analysis of loss landscapes, along with the properties of the NN Hessian. Different approaches to NN loss landscape plotting are discussed, with particular focus on normalization techniques, showing that conventional methods cannot always ensure correct visualization when batch normalization layers are present in the NN architecture. The use of Hessian axes is shown to mitigate this effect, and methods for choosing Hessian axes are proposed. In addition, spectra of the Hessian eigendecomposition are studied, and it is shown that typical spectra exist for a wide range of NNs. This makes it possible to propose quantitative criteria for Hessian analysis that can be applied to evaluate NN performance and assess its generalization…
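The idea of plotting a loss landscape along Hessian axes can be sketched as follows. This is an illustration under stated assumptions, not the method proposed in the paper: the axes are taken to be the two leading Hessian eigenvectors, found by power iteration on finite-difference Hessian-vector products, and the three-parameter convex `loss` is a hypothetical toy with no batch normalization. All function names are made up for this example.

```python
import math

def loss(w):
    # Hypothetical convex toy loss in three parameters (not a real network).
    x, y, z = w
    return 3.0 * x * x + 1.0 * y * y + 0.1 * z * z + 0.5 * x * y

def grad(w):
    # Analytic gradient of the toy loss above.
    x, y, z = w
    return [6.0 * x + 0.5 * y, 2.0 * y + 0.5 * x, 0.2 * z]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def normalize(v):
    n = math.sqrt(dot(v, v))
    return [p / n for p in v]

def hvp(w, v, eps=1e-3):
    # Hessian-vector product from a central difference of gradients.
    gp = grad([wi + eps * vi for wi, vi in zip(w, v)])
    gm = grad([wi - eps * vi for wi, vi in zip(w, v)])
    return [(p - m) / (2 * eps) for p, m in zip(gp, gm)]

def leading_eigvec(w, start, ortho=(), iters=200):
    # Power iteration; projecting out `ortho` yields the next eigenvector.
    # Assumes a positive semi-definite Hessian, as in this toy.
    v = normalize(start)
    for _ in range(iters):
        hv = hvp(w, v)
        for u in ortho:
            c = dot(hv, u)
            hv = [h - c * ui for h, ui in zip(hv, u)]
        v = normalize(hv)
    return v, dot(v, hvp(w, v))  # eigenvector and its Rayleigh quotient

def landscape(w, d1, d2, span=1.0, steps=5):
    # Loss on a steps x steps grid spanned by d1 (columns) and d2 (rows).
    coords = [-span + 2 * span * i / (steps - 1) for i in range(steps)]
    return [[loss([wi + a * p + b * q for wi, p, q in zip(w, d1, d2)])
             for a in coords] for b in coords]

w0 = [0.0, 0.0, 0.0]  # minimum of the toy loss
v1, lam1 = leading_eigvec(w0, [1.0, 1.0, 1.0])
v2, lam2 = leading_eigvec(w0, [1.0, 1.0, 1.0], ortho=(v1,))
grid = landscape(w0, v1, v2)
for row in grid:
    print(" ".join(f"{val:7.3f}" for val in row))
```

Because the axes are eigenvectors, the curvature of the plotted surface along each axis is the corresponding eigenvalue, and that is a property of the loss itself. Random directions lack this property and usually need a separate normalization step before the plot is meaningful.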
Taxonomy
Topics: Neural Networks and Applications
Methods: Batch Normalization · Lib · Focus
