Negative eigenvalues of the Hessian in deep neural networks
Guillaume Alain, Nicolas Le Roux, Pierre-Antoine Manzagol

TL;DR
This paper investigates the non-convex loss landscape of deep neural networks by analyzing Hessian eigenvalues, emphasizing the significance of negative eigenvalues and their impact on training dynamics.
Contribution
It provides a detailed analysis of the Hessian eigenvalues in deep networks and explores the effects of negative eigenvalues on optimization.
Findings
Negative eigenvalues are prevalent in deep network loss landscapes.
Handling negative eigenvalues appropriately can yield observable benefits during training.
The Hessian's eigendecomposition reveals insights into non-convexity.
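To illustrate what "handling" a negative eigenvalue can mean, here is a minimal sketch on a toy problem. It is not the procedure used in the paper. The function names (`loss`, `grad`, `hessian`, `negative_curvature_step`), the step size, and the toy loss itself (a two-weight scalar linear model with target 1) are all assumptions chosen for illustration. At the origin the gradient vanishes, so plain gradient descent does not move, while a step along the eigenvector of the negative eigenvalue lowers the loss.

```python
import numpy as np

# Hypothetical toy loss: a two-layer scalar linear "network" w[0] * w[1]
# fitted to the target 1. The origin is a saddle point of this loss.
def loss(w):
    return 0.5 * (w[0] * w[1] - 1.0) ** 2

def grad(w):
    r = w[0] * w[1] - 1.0          # residual
    return np.array([r * w[1], r * w[0]])

def hessian(w):
    off = 2.0 * w[0] * w[1] - 1.0  # mixed second derivative
    return np.array([[w[1] ** 2, off],
                     [off, w[0] ** 2]])

def negative_curvature_step(w, step=0.5):
    """Move along the eigenvector of the most negative Hessian eigenvalue.

    Returns w unchanged if the Hessian has no negative eigenvalue.
    """
    eigvals, eigvecs = np.linalg.eigh(hessian(w))  # eigenvalues in ascending order
    if eigvals[0] >= 0.0:
        return w
    v = eigvecs[:, 0]
    # Pick the sign that does not increase the loss to first order.
    if grad(w) @ v > 0.0:
        v = -v
    return w + step * v

w0 = np.zeros(2)
w1 = negative_curvature_step(w0)

print("gradient at w0:", grad(w0))   # zero vector: gradient descent is stuck here
print("loss before:", loss(w0))
print("loss after: ", loss(w1))
```

The step size here is fixed by hand. A practical method would need to choose it with care, since the quadratic model that justifies the direction only holds locally.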
Abstract
The loss function of deep networks is known to be non-convex, but the precise nature of this non-convexity is still an active area of research. In this work, we study the loss landscape of deep networks through the eigendecompositions of their Hessian matrix. In particular, we examine how important the negative eigenvalues are and the benefits one can observe in handling them appropriately.
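As a rough illustration of the kind of analysis the abstract describes, the sketch below estimates a Hessian by central finite differences and inspects its eigenvalues. Everything in it is an assumption for illustration: the helper `numerical_hessian`, the step `h`, and the two-parameter `toy_loss` stand in for a real network loss, and the paper's own procedure for computing eigendecompositions at scale is not reproduced here.

```python
import numpy as np

# Hypothetical toy loss: squared error of a two-layer scalar linear model
# w[0] * w[1] against the target 1. It is non-convex in (w[0], w[1]).
def toy_loss(w):
    return 0.5 * (w[0] * w[1] - 1.0) ** 2

def numerical_hessian(f, w, h=1e-3):
    """Estimate the Hessian of f at w with central finite differences."""
    w = np.asarray(w, dtype=float)
    n = w.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(i, n):
            ei = np.zeros(n)
            ej = np.zeros(n)
            ei[i] = h
            ej[j] = h
            H[i, j] = (f(w + ei + ej) - f(w + ei - ej)
                       - f(w - ei + ej) + f(w - ei - ej)) / (4.0 * h * h)
            H[j, i] = H[i, j]  # the Hessian is symmetric
    return H

H = numerical_hessian(toy_loss, [0.0, 0.0])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(H)
print("Hessian:\n", H)
print("eigenvalues:", eigenvalues)
print("number of negative eigenvalues:", int(np.sum(eigenvalues < 0)))
```

At the origin this toy loss has one negative and one positive eigenvalue (approximately -1 and +1), so the point is a saddle rather than a minimum. For networks with many parameters, forming the full Hessian this way is impractical, and methods based on Hessian-vector products are the usual alternative.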
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Matrix Theory and Algorithms
