TL;DR
This paper introduces a statistical mechanics model to analyze the high-dimensional weight space of deep neural networks, revealing a liquid-like structure that explains the effectiveness of depth in learning.
Contribution
It presents a least-structured model that captures the global geometry of deep network weight spaces, including those of networks with binary synapses, and uses this high-dimensional geometry to clarify the role of depth.
Findings
Under- and over-parameterized networks share similar connected weight spaces.
Shallow networks exhibit a broken, discontinuous weight space.
Deep networks contain a liquid-like central weight region.
Abstract
The geometric structure of an optimization landscape is argued to be fundamentally important to support the success of deep neural network learning. A direct computation of the landscape beyond two layers is hard. Therefore, to capture the global view of the landscape, an interpretable model of the network-parameter (or weight) space must be established. However, such a model is lacking so far. Furthermore, it remains unknown what the landscape looks like for deep networks of binary synapses, which play a key role in robust and energy-efficient neuromorphic computation. Here, we propose a statistical mechanics framework by directly building a least-structured model of the high-dimensional weight space, considering realistic structured data, stochastic gradient descent training, and the computational depth of neural networks. We also consider whether the number of network parameters…
