Geometric structure of Deep Learning networks and construction of global ${\mathcal L}^2$ minimizers
Thomas Chen, Patricia Mu\~noz Ewald

TL;DR
This paper explicitly constructs and analyzes the geometric structure of local and global minimizers of the ${\mathcal L}^2$ cost function in underparametrized Deep Learning networks.
Contribution
It provides a direct construction of minimizers in underparametrized deep networks without gradient flow, revealing their degeneracy and structure.
Findings
Explicit family of global minimizers for $L \,\geq \,Q$
Identification of $2^Q-1$ degenerate local minima
Reinterpretation of layer concatenation as a recursive truncation map
Abstract
In this paper, we explicitly determine local and global minimizers of the ${\mathcal L}^2$ cost function in underparametrized Deep Learning (DL) networks; our main goal is to shed light on their geometric structure and properties. We accomplish this by a direct construction, without invoking the gradient descent flow at any point of this work. We specifically consider $L$ hidden layers, a ReLU ramp activation function, an ${\mathcal L}^2$ Schatten class (or Hilbert-Schmidt) cost function, input and output spaces ${\mathbb R}^Q$ with equal dimension $Q \geq 1$, and hidden layers also defined on ${\mathbb R}^Q$; the training inputs are assumed to be sufficiently clustered. The training input size $N$ can be arbitrarily large; thus, we are considering the underparametrized regime. More general settings are left to future work. We construct an explicit family of minimizers for the global…
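The setting described in the abstract can be made concrete with a minimal NumPy sketch. It is only an illustration under stated assumptions, not the authors' construction: the function names (`forward`, `l2_cost`), the division of the cost by $N$, the affine output layer without activation, and the toy clustered data are all hypothetical choices made here for readability.

```python
import numpy as np

def relu(x):
    # ReLU ramp activation, applied entrywise
    return np.maximum(x, 0.0)

def forward(x0, weights, biases):
    """Propagate inputs through L hidden ReLU layers and one affine output layer.

    x0:      array of shape (Q, N); each column is one training input in R^Q
    weights: list of L + 1 arrays of shape (Q, Q)
    biases:  list of L + 1 arrays of shape (Q,)
    Assumption: the output layer is affine (no activation).
    """
    x = x0
    for W, b in zip(weights[:-1], biases[:-1]):
        x = relu(W @ x + b[:, None])
    return weights[-1] @ x + biases[-1][:, None]

def l2_cost(x0, y, weights, biases):
    """Squared Hilbert-Schmidt (Frobenius) norm of the residual.

    Assumption: the cost is divided by the number of training inputs N.
    """
    residual = forward(x0, weights, biases) - y
    return np.sum(residual ** 2) / x0.shape[1]

# Toy data (hypothetical): Q classes, N inputs clustered around Q centres.
Q, L, N = 3, 3, 300
rng = np.random.default_rng(0)
centres = 5.0 * np.eye(Q) + 1.0            # strictly positive entries
labels = np.repeat(np.arange(Q), N // Q)   # class index of each input
x0 = centres[:, labels] + 0.05 * rng.standard_normal((Q, N))
y = np.eye(Q)[:, labels]                   # one-hot reference outputs

# Identity weights and zero biases: a trivial parameter choice, not a minimizer.
weights = [np.eye(Q) for _ in range(L + 1)]
biases = [np.zeros(Q) for _ in range(L + 1)]

out = forward(x0, weights, biases)
cost = l2_cost(x0, y, weights, biases)
```

The number of parameters here is $(L+1)(Q^2+Q)$, independent of $N$, which is why an arbitrarily large training set places the network in the underparametrized regime.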
Taxonomy
Topics: Advanced Numerical Analysis Techniques · Topology Optimization in Engineering · Medical Image Segmentation Techniques
