Geometric structure of shallow neural networks and constructive ${\mathcal L}^2$ cost minimization

Thomas Chen; Patrícia Muñoz Ewald

arXiv:2309.10370·cs.LG·March 2, 2026



TL;DR

This paper analyzes the geometric structure of underparametrized shallow ReLU networks and constructs explicit upper bounds on the minimum of the ${\mathcal L}^2$ cost without gradient descent. Both the bounds and an explicit local minimum are determined by the structure of the classification training data.

Contribution

The paper introduces a geometric approach for constructing upper bounds on the cost minimum in shallow networks, explicitly determines a degenerate local minimum in the case $M = Q$, and ties both to the structure of the training data without relying on gradient methods.

Findings

01

Upper bound of order $O(\delta_P)$ on the minimum of the cost function

02

Explicit exact degenerate local minimum of the cost function for the case $M = Q$

03

Constructive network training that captures a Q-dimensional subspace

Abstract

In this paper, we approach the problem of cost (loss) minimization in underparametrized shallow ReLU networks through the explicit construction of upper bounds which appeal to the structure of classification data, without use of gradient descent. A key focus is on elucidating the geometric structure of approximate and precise minimizers. We consider an ${\mathcal L}^2$ cost function, input space $\mathbb{R}^{M}$, output space $\mathbb{R}^{Q}$ with $Q \leq M$, and training input sample size that can be arbitrarily large. We prove an upper bound on the minimum of the cost function of order $O(\delta_{P})$, where $\delta_{P}$ measures the signal-to-noise ratio of training data. In the special case $M = Q$, we explicitly determine an exact degenerate local minimum of the cost function, and show that the sharp value differs from the upper bound obtained for $Q \leq M$ by a relative error $O(\delta_{P}^{2})$. The proof…
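To make the setting in the abstract concrete, the following is a minimal sketch of a shallow ReLU network, an ${\mathcal L}^2$-type cost over classified training inputs, and a noise-to-signal style ratio. It is an illustration under stated assumptions, not the authors' construction: the function names, the $1/(2N)$ normalization, the one-hot label vectors, and the ratio `noise_to_signal` (largest deviation from a class mean divided by the smallest class-mean norm) are all hypothetical stand-ins. In particular, `noise_to_signal` is not the paper's definition of $\delta_P$, which this page does not state.

```python
import math

def relu(v):
    # Componentwise ReLU activation.
    return [max(0.0, t) for t in v]

def affine(W, b, x):
    # Computes W x + b for a matrix W given as a list of rows.
    return [sum(w * t for w, t in zip(row, x)) + bi for row, bi in zip(W, b)]

def shallow_net(x, W1, b1, W2, b2):
    # One hidden ReLU layer followed by a linear output layer.
    return affine(W2, b2, relu(affine(W1, b1, x)))

def l2_cost(data, labels, W1, b1, W2, b2):
    # data[j] holds the inputs of class j, labels[j] the target vector in R^Q.
    # Assumption: cost normalized by 1/(2N), N = total number of inputs.
    n = sum(len(cls) for cls in data)
    total = 0.0
    for cls, y in zip(data, labels):
        for x in cls:
            out = shallow_net(x, W1, b1, W2, b2)
            total += sum((o - t) ** 2 for o, t in zip(out, y))
    return total / (2 * n)

def class_mean(cls):
    # Average of the input vectors in one class.
    m = len(cls[0])
    return [sum(x[k] for x in cls) / len(cls) for k in range(m)]

def norm(v):
    # Euclidean norm.
    return math.sqrt(sum(t * t for t in v))

def noise_to_signal(data):
    # Hypothetical ratio: largest deviation from a class mean,
    # divided by the smallest class-mean norm.
    means = [class_mean(cls) for cls in data]
    max_dev = max(
        norm([a - b for a, b in zip(x, mu)])
        for cls, mu in zip(data, means)
        for x in cls
    )
    return max_dev / min(norm(mu) for mu in means)

# Toy example with M = Q = 2: two tight clusters near the coordinate axes.
data = [
    [[1.0, 0.1], [1.0, -0.1]],
    [[0.1, 1.0], [-0.1, 1.0]],
]
labels = [[1.0, 0.0], [0.0, 1.0]]  # assumed one-hot targets
identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [0.0, 0.0]

ratio = noise_to_signal(data)
cost = l2_cost(data, labels, identity, zero, identity, zero)
print(ratio, cost)
```

In this toy example the weights are simply set to the identity rather than trained, which loosely mirrors the abstract's theme of writing down parameters by hand instead of running gradient descent. The resulting cost is small because the clusters are tight relative to the size of their means.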


Taxonomy

Topics: Advanced Numerical Analysis Techniques · Neural Networks and Applications · Image and Object Detection Techniques
