Visualizing high-dimensional loss landscapes with Hessian directions
Lucas Böttcher, Gregory Wheeler

TL;DR
This paper combines differential geometry and high-dimensional probability to study how well lower-dimensional projections of neural-network loss landscapes preserve the curvature of the original space. It shows where random projections fail and proposes Hessian-based projections as an alternative.
Contribution
The paper introduces an analysis linking curvature in the high-dimensional loss space to curvature in lower-dimensional projections, and proposes Hessian-based projections for more faithful landscape visualization.
Findings
With random projections, saddle points of the original loss space are rarely identified as saddle points in the expected lower-dimensional representation.
Hessian-based projections better capture true curvature directions.
The methods scale to models with millions of parameters.
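The first two findings can be illustrated on a toy problem. The sketch below is not the authors' implementation; it assumes a hypothetical quadratic loss with a hand-built 100-dimensional Hessian that has one negative eigenvalue, so the origin is a saddle. It compares the curvature seen along random directions with the curvature along the Hessian eigenvectors of smallest and largest eigenvalue. All names (`H`, `curvature_along`, the chosen eigenvalues) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy loss L(x) = 0.5 * x^T H x (an assumption for illustration).
# The Hessian has 99 eigenvalues equal to 1 and one equal to -5,
# so the origin is a saddle point.
n = 100
eigenvalues = np.ones(n)
eigenvalues[-1] = -5.0

# Rotate into a random orthonormal basis so the saddle is not axis-aligned.
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
H = Q @ np.diag(eigenvalues) @ Q.T
H = 0.5 * (H + H.T)  # remove floating-point asymmetry


def curvature_along(hessian, direction):
    """Second directional derivative of the quadratic loss along a direction."""
    u = direction / np.linalg.norm(direction)
    return float(u @ hessian @ u)


# Curvature along many random Gaussian directions.
random_dirs = rng.standard_normal((2000, n))
random_curvs = np.array([curvature_along(H, d) for d in random_dirs])

# Mean curvature of the original space: trace(H) / n = (99 - 5) / 100 = 0.94.
mean_curvature = np.trace(H) / n

# Hessian directions: eigenvectors of the smallest and largest eigenvalue
# (np.linalg.eigh returns eigenvalues in ascending order).
evals, evecs = np.linalg.eigh(H)
curv_min_dir = curvature_along(H, evecs[:, 0])
curv_max_dir = curvature_along(H, evecs[:, -1])

print("mean curvature tr(H)/n:        ", mean_curvature)
print("average random-direction curv.:", random_curvs.mean())
print("fraction of negative random:   ", np.mean(random_curvs < 0))
print("curvature along Hessian dirs:  ", curv_min_dir, curv_max_dir)
```

In this toy setting the average curvature along random directions tracks the mean curvature of the original space, which is positive, so a random slice almost always looks like a minimum. The two Hessian directions have curvatures of opposite sign and expose the saddle. For real networks the full Hessian is usually too large to form explicitly, so the dense `np.linalg.eigh` call here should be read as a stand-in for whatever eigenvector computation is practical.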
Abstract
Analyzing geometric properties of high-dimensional loss functions, such as local curvature and the existence of other optima around a certain point in loss space, can help provide a better understanding of the interplay between neural network structure, implementation attributes, and learning performance. In this work, we combine concepts from high-dimensional probability and differential geometry to study how curvature properties in lower-dimensional loss representations depend on those in the original loss space. We show that saddle points in the original space are rarely correctly identified as such in expected lower-dimensional representations if random projections are used. The principal curvature in the expected lower-dimensional representation is proportional to the mean curvature in the original loss space. Hence, the mean curvature in the original loss space determines if…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Advanced Neuroimaging Techniques and Applications
