Explorations on high dimensional landscapes
Levent Sagun, V. Ugur Guney, Gerard Ben Arous, Yann LeCun

TL;DR
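This paper investigates the structure of high-dimensional non-convex functions. It gives evidence that, for some such functions, the bulk of the critical points lies in a narrow band of function values, and it observes that gradient descent and stochastic gradient descent both reach this level within the same number of steps.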
Contribution
It provides empirical evidence of a narrow band of critical values in high dimensions, in agreement with earlier theoretical results on spin glasses, and reports a similar phenomenon in teacher-student deep networks trained on MNIST.
Findings
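Some non-convex functions on high-dimensional domains have a narrow band of values that contains the bulk of their critical points; in low dimensions this band is wide.
Gradient descent and stochastic gradient descent both reach this level within the same number of steps.
Simulations agree with theoretical results on spin glasses, which prove that such a band exists as the dimension tends to infinity.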
Abstract
Finding minima of a real-valued non-convex function over a high-dimensional space is a major challenge in science. We provide evidence that some such functions defined on high-dimensional domains have a narrow band of values whose pre-image contains the bulk of their critical points. This is in contrast with the low-dimensional picture, in which this band is wide. Our simulations agree with previous theoretical work on spin glasses that proves the existence of such a band when the dimension of the domain tends to infinity. Furthermore, our experiments on teacher-student networks with the MNIST dataset establish a similar phenomenon in deep networks. Finally, we observe that both gradient descent and stochastic gradient descent can reach this level within the same number of steps.
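The spin glass experiment described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a spherical 3-spin glass with i.i.d. standard Gaussian couplings, runs plain gradient descent constrained to the sphere from several random starting points, and records the normalized energy H(w)/N before and after descent. The function names (`sample_couplings`, `energy`, `descend`, `run_experiment`), the choice p = 3, the step size, and the dimension are all illustrative assumptions.

```python
import itertools
import numpy as np


def sample_couplings(n, rng):
    """Draw i.i.d. Gaussian couplings J_ijk and symmetrize them over index order."""
    j = rng.standard_normal((n, n, n))
    perms = list(itertools.permutations(range(3)))
    return sum(j.transpose(p) for p in perms) / len(perms)


def energy(j_sym, x):
    """Normalized energy H(w)/N at w = sqrt(N) * x, where x is a unit vector."""
    n = x.shape[0]
    return float(((j_sym @ x) @ x) @ x / np.sqrt(n))


def descend(j_sym, x0, lr=0.05, steps=500):
    """Gradient descent on the unit sphere: tangent step, then renormalize."""
    n = x0.shape[0]
    x = x0 / np.linalg.norm(x0)
    for _ in range(steps):
        # Euclidean gradient of the cubic form (valid because j_sym is symmetric).
        grad = 3.0 * ((j_sym @ x) @ x) / np.sqrt(n)
        # Remove the radial component so the step stays tangent to the sphere.
        grad = grad - (grad @ x) * x
        x = x - lr * grad
        x = x / np.linalg.norm(x)
    return x


def run_experiment(n=50, runs=20, seed=0, lr=0.05, steps=500):
    """One fixed landscape, many random starts; returns (initial, final) energies."""
    rng = np.random.default_rng(seed)
    j_sym = sample_couplings(n, rng)
    initial, final = [], []
    for _ in range(runs):
        x0 = rng.standard_normal(n)
        x0 /= np.linalg.norm(x0)
        initial.append(energy(j_sym, x0))
        final.append(energy(j_sym, descend(j_sym, x0, lr=lr, steps=steps)))
    return np.array(initial), np.array(final)


if __name__ == "__main__":
    for n in (10, 50):
        _, final = run_experiment(n=n)
        print(f"N={n}: final energy mean={final.mean():.3f}, std={final.std():.3f}")
```

Comparing the spread of the final energies across several values of N is one way to probe the claim that the band is wide in low dimensions and narrows as the dimension grows. A single disorder sample at small N is noisy, so any trend is best checked over several seeds.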
Taxonomy
Topics: Data Management and Algorithms · Human Mobility and Location-Based Analysis · Land Use and Ecosystem Services
