Strong overall error analysis for the training of artificial neural networks via random initializations

Arnulf Jentzen; Adrian Riekert

arXiv:2012.08443 · cs.LG · April 13, 2023

TL;DR

This paper improves previously known overall error estimates for deep supervised learning with randomly initialized training: the depth of the neural network needs to grow much more slowly to achieve the same rate of approximation. The results hold for an arbitrary stochastic optimization algorithm with i.i.d. random initializations.

Contribution

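The paper partially improves on existing convergence estimates for the overall error in deep supervised learning, which so far came with an extremely slow rate of convergence. It shows that the network depth needs to increase much more slowly than previously required to obtain the same rate of approximation.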

Findings

01 Depth growth rate can be reduced for the same approximation accuracy

02 Results apply to arbitrary stochastic optimization algorithms with i.i.d. random initializations

03 Improved theoretical understanding of error convergence in deep learning

Abstract

Although deep learning based approximation algorithms have been applied very successfully to numerous problems, at the moment the reasons for their performance are not entirely understood from a mathematical point of view. Recently, estimates for the convergence of the overall error have been obtained in the situation of deep supervised learning, but with an extremely slow rate of convergence. In this note we partially improve on these estimates. More specifically, we show that the depth of the neural network only needs to increase much more slowly in order to obtain the same rate of approximation. The results hold in the case of an arbitrary stochastic optimization algorithm with i.i.d. random initializations.
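
The training setup the abstract refers to, a stochastic optimizer restarted from several i.i.d. random initializations, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the shallow ReLU network, the uniform initialization on [-1, 1], plain SGD with batch size 1, the squared loss, and the rule of keeping the run with the smallest empirical risk on the training sample. The function names are hypothetical.

```python
import random


def predict(theta, x, width):
    """Shallow ReLU network: sum_j v_j * max(w_j * x + b_j, 0) + c."""
    out = theta[3 * width]  # output bias c is stored last
    for j in range(width):
        w, b, v = theta[3 * j], theta[3 * j + 1], theta[3 * j + 2]
        out += v * max(w * x + b, 0.0)
    return out


def empirical_risk(theta, data, width):
    """Mean squared error over the training sample."""
    return sum((predict(theta, x, width) - y) ** 2 for x, y in data) / len(data)


def sgd_run(theta, data, width, steps, lr, rng):
    """Plain SGD with batch size 1, started from the given parameters."""
    theta = list(theta)
    for _ in range(steps):
        x, y = rng.choice(data)
        err = predict(theta, x, width) - y
        grad = [0.0] * len(theta)
        for j in range(width):
            w, b, v = theta[3 * j], theta[3 * j + 1], theta[3 * j + 2]
            pre = w * x + b
            if pre > 0.0:  # the ReLU passes gradient only when active
                grad[3 * j] = 2.0 * err * v * x
                grad[3 * j + 1] = 2.0 * err * v
                grad[3 * j + 2] = 2.0 * err * pre
        grad[3 * width] = 2.0 * err
        theta = [t - lr * g for t, g in zip(theta, grad)]
    return theta


def train_with_random_inits(data, width, num_inits, steps, lr, seed=0):
    """Run SGD from num_inits i.i.d. initializations and keep the best run.

    Assumption: "best" means smallest empirical risk on the training data.
    """
    rng = random.Random(seed)
    best_theta, best_risk = None, float("inf")
    risks = []
    for _ in range(num_inits):
        # i.i.d. initialization: every parameter drawn uniformly from [-1, 1]
        theta0 = [rng.uniform(-1.0, 1.0) for _ in range(3 * width + 1)]
        theta = sgd_run(theta0, data, width, steps, lr, rng)
        risk = empirical_risk(theta, data, width)
        risks.append(risk)
        if risk < best_risk:
            best_theta, best_risk = theta, risk
    return best_theta, best_risk, risks
```

Calling `train_with_random_inits(data, width=4, num_inits=5, steps=200, lr=0.01)` on a list of `(x, y)` pairs returns the selected parameters, their empirical risk, and the risks of all runs. The sketch only shows the restart-and-select structure; it says nothing about the error rates the paper proves.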

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks