Full error analysis for the training of deep neural networks

Christian Beck; Arnulf Jentzen; Benno Kuckuck

arXiv:1910.00121 · math.NA · February 10, 2023

Open Access

TL;DR

This paper provides a comprehensive error analysis for deep neural network training, decomposing the total error into approximation, generalization, and optimization errors, and establishing convergence with a slow, dimension-dependent speed.

Contribution

It introduces a full error decomposition framework for deep learning algorithms, combining the three main error sources into a unified convergence analysis.

Findings

01 Estimates each error component separately

02 Combines errors to analyze overall convergence

03 Shows convergence speed is slow and dimension-dependent

Abstract

Deep learning algorithms have been applied very successfully in recent years to a range of problems out of reach for classical solution paradigms. Nevertheless, there is no completely rigorous mathematical error and convergence analysis which explains the success of deep learning algorithms. The error of a deep learning algorithm can in many situations be decomposed into three parts, the approximation error, the generalization error, and the optimization error. In this work we estimate for a certain deep learning algorithm each of these three errors and combine these three error estimates to obtain an overall error analysis for the deep learning algorithm under consideration. In particular, we thereby establish convergence with a suitable convergence speed for the overall error of the deep learning algorithm under consideration. Our convergence speed analysis is far from optimal and the…
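The three-part decomposition named in the abstract can be illustrated with a generic, textbook-style inequality. This is a sketch under assumed notation and not the paper's exact statement: here $\mathcal{R}$ is a hypothetical population risk, $\widehat{\mathcal{R}}$ the empirical risk on the training sample, $\mathcal{F}$ the class of network functions considered, $f^{*}$ the target function, and $\hat{f} \in \mathcal{F}$ the function returned by the training procedure. None of these symbols appear in the text above.

```latex
% Illustrative decomposition (assumed notation, not taken from the paper).
% For every f in F, add and subtract the empirical risks of \hat{f} and f:
\begin{align*}
\mathcal{R}(\hat{f}) - \mathcal{R}(f^{*})
  &= \bigl[\mathcal{R}(\hat{f}) - \widehat{\mathcal{R}}(\hat{f})\bigr]
   + \bigl[\widehat{\mathcal{R}}(\hat{f}) - \widehat{\mathcal{R}}(f)\bigr]
   + \bigl[\widehat{\mathcal{R}}(f) - \mathcal{R}(f)\bigr]
   + \bigl[\mathcal{R}(f) - \mathcal{R}(f^{*})\bigr].
\end{align*}
% Bounding the first and third brackets by the uniform deviation, the second
% by the gap to the empirical infimum, and taking the infimum over f in F:
\begin{align*}
\mathcal{R}(\hat{f}) - \mathcal{R}(f^{*})
  &\le \underbrace{\inf_{f \in \mathcal{F}} \mathcal{R}(f) - \mathcal{R}(f^{*})}_{\text{approximation error}}
   + \underbrace{2 \sup_{f \in \mathcal{F}} \bigl|\mathcal{R}(f) - \widehat{\mathcal{R}}(f)\bigr|}_{\text{generalization error}}
   + \underbrace{\widehat{\mathcal{R}}(\hat{f}) - \inf_{f \in \mathcal{F}} \widehat{\mathcal{R}}(f)}_{\text{optimization error}}.
\end{align*}
```

Read this way, a bound on the overall error follows from bounding each of the three terms separately and summing the results, which matches the strategy the abstract describes.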

Taxonomy

Topics: Neural Networks and Applications · Non-Destructive Testing Techniques · Stochastic Gradient Optimization Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings