Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$

R. Gentile; G. Welper

arXiv:2209.08399·cs.LG·September 20, 2022·1 cites

Open Access · 1 Repo

TL;DR

This paper investigates the approximation capabilities of shallow neural networks in one dimension trained with gradient descent, balancing theoretical approximation results with practical training considerations.

Contribution

It provides the first approximation results for finite-width, shallow neural networks in 1D trained via gradient descent, highlighting the trade-offs in approximation rates.

Findings

1. Gradient descent can effectively train shallow networks for approximation in 1D.
2. Finite width networks achieve near-optimal approximation with some redundancy.
3. The approximation rate is slightly lower than the best possible due to non-overparametrization.

Abstract

Two aspects of neural networks that have been extensively studied in the recent literature are their function approximation properties and their training by gradient descent methods. The approximation problem seeks accurate approximations with a minimal number of weights. In most of the current literature these weights are fully or partially hand-crafted, showing the capabilities of neural networks but not necessarily their practical performance. In contrast, optimization theory for neural networks heavily relies on an abundance of weights in over-parametrized regimes. This paper balances these two demands and provides an approximation result for shallow networks in $1d$ with non-convex weight optimization by gradient descent. We consider finite width networks and infinite sample limits, which is the typical setup in approximation theory. Technically, this problem is not…
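To make the setting in the abstract concrete, here is a minimal sketch of a finite-width shallow ReLU network in $1d$ trained by plain gradient descent. It is not the authors' code or their exact setup: the target function, the uniform initialization, the choice to train all three weight groups, the step size, and every name (`target`, `forward`, `loss`, `train`) are assumptions made for illustration. A dense uniform grid on $[0, 1]$ stands in for the infinite sample limit.

```python
import math
import random


def target(x):
    # Hypothetical smooth target on [0, 1]; not taken from the paper.
    return math.sin(2.0 * math.pi * x)


def forward(x, a, w, b):
    # Shallow ReLU network: f(x) = sum_r a[r] * relu(w[r] * x + b[r]).
    return sum(ar * max(0.0, wr * x + br) for ar, wr, br in zip(a, w, b))


def loss(xs, a, w, b):
    # Mean squared error over the grid, with the conventional factor 1/2.
    return sum((forward(x, a, w, b) - target(x)) ** 2 for x in xs) / (2 * len(xs))


def train(width=16, n_samples=32, steps=300, lr=0.02, seed=0):
    rng = random.Random(seed)
    # Uniform grid on [0, 1] as a stand-in for the infinite sample limit.
    xs = [i / (n_samples - 1) for i in range(n_samples)]
    # Assumed initialization: uniform weights, outer weights scaled by 1/sqrt(width).
    a = [rng.uniform(-1.0, 1.0) / math.sqrt(width) for _ in range(width)]
    w = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    b = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    history = [loss(xs, a, w, b)]
    for _ in range(steps):
        ga = [0.0] * width
        gw = [0.0] * width
        gb = [0.0] * width
        for x in xs:
            pre = [wr * x + br for wr, br in zip(w, b)]
            residual = sum(ar * max(0.0, p) for ar, p in zip(a, pre)) - target(x)
            for r in range(width):
                if pre[r] > 0.0:  # ReLU passes gradient only where it is active
                    ga[r] += residual * pre[r]
                    gw[r] += residual * a[r] * x
                    gb[r] += residual * a[r]
        # Full-batch gradient descent step on all weights (non-convex in w and b).
        scale = lr / n_samples
        a = [ar - scale * g for ar, g in zip(a, ga)]
        w = [wr - scale * g for wr, g in zip(w, gw)]
        b = [br - scale * g for br, g in zip(b, gb)]
        history.append(loss(xs, a, w, b))
    return history


if __name__ == "__main__":
    history = train()
    print("initial loss:", history[0])
    print("final loss:  ", history[-1])
```

Running the script prints the loss before and after training. Varying `width` while holding the other settings fixed gives a rough feel for how the attainable error depends on the number of weights, though nothing here reproduces the rates proved in the paper.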


Code & Models

Repositories

rustygentile/approx-trained
PyTorch · Official


Taxonomy

Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms