Approximation Results for Gradient Descent trained Neural Networks
G. Welper

TL;DR
This paper provides approximation guarantees for gradient flow-trained neural networks on Sobolev smooth targets, highlighting the role of the neural tangent kernel in an under-parametrized regime with constant depth and increasing width.
Contribution
It introduces new approximation guarantees for fully connected neural networks trained with gradient flow, using NTK analysis in an under-parametrized setting with Sobolev smooth targets.
Findings
Approximation guarantees are established in the $L_2$ norm on the sphere.
Results apply to networks with constant depth and increasing width.
The over-parametrization typical of NTK analysis reappears as a loss in approximation rate relative to established approximation methods for Sobolev smooth functions.
Abstract
The paper contains approximation guarantees for neural networks that are trained with gradient flow, with error measured in the continuous $L_2(\mathbb{S}^{d-1})$-norm on the $d$-dimensional unit sphere and targets that are Sobolev smooth. The networks are fully connected, of constant depth and increasing width. Although all layers are trained, the gradient flow convergence is based on a neural tangent kernel (NTK) argument for the non-convex second-to-last layer. Unlike standard NTK analysis, the continuous error norm implies an under-parametrized regime, made possible by the natural smoothness assumption required for approximation. The typical over-parametrization re-enters the results in the form of a loss in approximation rate relative to established approximation methods for Sobolev smooth functions.
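For readers unfamiliar with the NTK, the following is a minimal illustrative sketch, not the paper's construction: it computes the empirical tangent kernel of a two-layer ReLU network with inputs on the unit sphere, differentiating only with respect to the hidden-layer weights. The network form, the function names `sample_sphere` and `empirical_ntk`, and the sizes `n`, `d`, `m` are all assumptions chosen for illustration; the paper itself treats deeper networks in which all layers are trained.

```python
import numpy as np

def sample_sphere(n, d, rng):
    """Draw n points uniformly from the unit sphere in R^d."""
    x = rng.standard_normal((n, d))
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def empirical_ntk(X, W, a):
    """Empirical NTK of f(x) = (1/sqrt(m)) * sum_r a_r * relu(w_r . x),
    taking gradients with respect to the hidden weights W only
    (an illustrative simplification, assumed here).

    grad_{w_r} f(x) = a_r * 1[w_r . x > 0] * x / sqrt(m), so
    K(x, x') = (x . x') * (1/m) * sum_r a_r^2 * 1[w_r . x > 0] * 1[w_r . x' > 0].
    """
    m = W.shape[0]
    active = (X @ W.T > 0).astype(float)   # ReLU derivative, shape (n, m)
    gate = active * a                      # scale each unit by its outer weight
    return (gate @ gate.T) * (X @ X.T) / m

rng = np.random.default_rng(0)
n, d, m = 20, 3, 512                       # hypothetical sample count, dimension, width
X = sample_sphere(n, d, rng)
W = rng.standard_normal((m, d))            # random hidden-layer initialization
a = rng.choice([-1.0, 1.0], size=m)        # fixed outer weights of magnitude one
K = empirical_ntk(X, W, a)
```

Because `K` is the Gram matrix of parameter gradients, it is symmetric and positive semi-definite. Linearized gradient flow analysis relies on how such a kernel behaves as the width `m` grows.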
Taxonomy
Topics: Medical Imaging and Analysis · Radiomics and Machine Learning in Medical Imaging · Advanced Neural Network Applications
Methods: Neural Tangent Kernel
