Continuous limits of residual neural networks in case of large input data

M. Herty; A. Thuenen; T. Trimborn; G. Visconti

arXiv:2112.14150·math.AP·May 11, 2022·1 cites

Open Access

TL;DR

This paper studies continuous limits of residual neural networks with large input data, deriving a mean-field description and analyzing the training process through controllability and optimal control, supported by numerical experiments.

Contribution

It introduces a mean-field limit for ResNets with large input data and studies the training process using controllability and optimal control frameworks.

Findings

01 Derived a mean-field limit for ResNets with large-scale input data

02 Proved well-posedness of the resulting mean-field description

03 Numerical simulations support the theoretical results

Abstract

Residual deep neural networks (ResNets) are mathematically described as interacting particle systems. In the case of infinitely many layers, the ResNet leads to a coupled system of ordinary differential equations known as neural differential equations. For large-scale input data we derive a mean-field limit and show well-posedness of the resulting description. Further, we analyze the existence of solutions to the training process by using both a controllability and an optimal control point of view. Numerical investigations based on the solution of a formal optimality system illustrate the theoretical findings.
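The link between ResNets and neural differential equations stated in the abstract can be illustrated with a minimal sketch. It assumes the common residual update x <- x + dt * sigma(W x + b), which is one explicit Euler step of dx/dt = sigma(W(t) x + b(t)). The tanh activation, the function names `resnet_forward` and `solve_scalar`, and the constant-in-time weights are illustrative assumptions, not taken from the paper.

```python
import math


def resnet_forward(x0, weights, biases, dt):
    """Propagate one input vector through a residual network.

    Each layer applies x <- x + dt * tanh(W x + b), i.e. one explicit
    Euler step of the ODE dx/dt = tanh(W(t) x + b(t)).
    The tanh activation is an assumption made for this sketch.
    """
    x = list(x0)
    n = len(x)
    for W, b in zip(weights, biases):
        # Pre-activation of the current layer: W x + b
        pre = [sum(W[i][j] * x[j] for j in range(n)) + b[i] for i in range(n)]
        # Residual (skip) connection scaled by the step size dt
        x = [x[i] + dt * math.tanh(pre[i]) for i in range(n)]
    return x


def solve_scalar(x0, w, b, T, layers):
    """Scalar ResNet with constant weight w and bias b over time horizon T."""
    dt = T / layers
    return resnet_forward([x0], [[[w]]] * layers, [[b]] * layers, dt)[0]


if __name__ == "__main__":
    # For w = -1, b = 0 the ODE dx/dt = -tanh(x) has the closed-form
    # solution sinh(x(t)) = sinh(x0) * exp(-t).
    exact = math.asinh(math.sinh(1.0) * math.exp(-1.0))
    for layers in (10, 100, 1000):
        err = abs(solve_scalar(1.0, -1.0, 0.0, 1.0, layers) - exact)
        print(layers, err)
```

In this toy setting the gap between the network output and the exact ODE solution shrinks as the number of layers grows with the time horizon T held fixed, which is the infinite-layer limit the abstract refers to. The mean-field limit in the input dimension is a separate step that this sketch does not cover.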

Taxonomy

Topics: Model Reduction and Neural Networks · Advanced Numerical Methods in Computational Mathematics · Numerical methods for differential equations