PDE Models for Deep Neural Networks: Learning Theory, Calculus of Variations and Optimal Control

Peter Markowich; Simone Portaro

arXiv:2411.06290·math.OC·November 12, 2024

Open Access

TL;DR

This paper develops a PDE-based framework for deep neural networks by taking continuum limits in both network width and depth, then studies forward propagation and the learning problem with variational calculus, the Pontryagin Maximum Principle, and the Hamilton-Jacobi-Bellman equation.

Contribution

The paper introduces a PDE and optimal control framework for deep neural networks that extends earlier discrete and ODE-based models by capturing interactions among hidden nodes, and it supplies theoretical tools for analyzing the learning problem.

Findings

01. Established well-posedness of the PDE forward problem

02. Proved existence of viscosity solutions for the control problem

03. Developed PDE-based numerical methods for training neural networks

Abstract

We propose a partial differential-integral equation (PDE) framework for deep neural networks (DNNs) and their associated learning problem by taking the continuum limits of both network width and depth. The proposed model captures the complex interactions among hidden nodes, overcoming limitations of traditional discrete and ordinary differential equation (ODE)-based models. We explore the well-posedness of the forward propagation problem, analyze the existence and properties of minimizers for the learning task, and provide a detailed examination of necessary and sufficient conditions for the existence of critical points. Controllability and optimality conditions for the learning task with its associated PDE forward problem are established using variational calculus, the Pontryagin Maximum Principle, and the Hamilton-Jacobi-Bellman equation, framing the deep learning process as a…
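
To make the double continuum limit concrete, here is a minimal sketch of a forward pass discretized in both depth and width. It assumes an illustrative model of the form du/dt(t, x) = tanh(∫ w(t, x, y) u(t, y) dy + b(t, x)) on x in [0, 1]; this form, the tanh activation, and the names `forward`, `weight`, and `bias` are assumptions made for illustration and are not taken from the paper. Explicit Euler in t plays the role of layers, and a midpoint rule in y plays the role of hidden nodes, so each step has the shape of a residual layer.

```python
import numpy as np

def forward(u0, weight, bias, depth_T=1.0, n_layers=50):
    """Explicit Euler in depth t, midpoint quadrature in width x.

    Assumed (illustrative) model, not the paper's exact equation:
        du/dt(t, x) = tanh( int_0^1 weight(t, x, y) u(t, y) dy + bias(t, x) )
    """
    n = u0.shape[0]
    x = (np.arange(n) + 0.5) / n   # midpoint grid on [0, 1]
    dx = 1.0 / n                   # quadrature weight per node
    dt = depth_T / n_layers        # depth step, one per "layer"
    u = np.array(u0, dtype=float)
    for k in range(n_layers):
        t = k * dt
        # Sample the kernel on the (x, y) grid: shape (n, n).
        W = weight(t, x[:, None], x[None, :])
        # Quadrature approximation of the integral over y, plus bias.
        pre = (W @ u) * dx + bias(t, x)
        # Residual-style update: u_{k+1} = u_k + dt * sigma(pre).
        u = u + dt * np.tanh(pre)
    return x, u

# Hypothetical smooth kernel and bias, chosen only for the demo.
x, u = forward(
    u0=np.ones(64),
    weight=lambda t, x, y: np.exp(-(x - y) ** 2),
    bias=lambda t, x: 0.1 * np.sin(2 * np.pi * x),
)
print(u.shape)
```

Refining `n_layers` and the number of grid points approximates the depth and width limits respectively; a learning problem would then treat `weight` and `bias` as the controls to be optimized.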

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks