PDE Models for Deep Neural Networks: Learning Theory, Calculus of Variations and Optimal Control
Peter Markowich, Simone Portaro

TL;DR
This paper develops a PDE-based framework for deep neural networks by taking continuum limits in both width and depth, then studies forward propagation and the learning problem using variational calculus and optimal control theory.
Contribution
The paper introduces a PDE and optimal control framework for deep neural networks that extends earlier discrete and ODE-based models and provides new theoretical tools for analysis and design.
Findings
Established well-posedness of the PDE forward problem
Proved existence of viscosity solutions for the control problem
Developed PDE-based numerical methods for training neural networks
Abstract
We propose a partial differential-integral equation (PDE) framework for deep neural networks (DNNs) and their associated learning problem by taking the continuum limits of both network width and depth. The proposed model captures the complex interactions among hidden nodes, overcoming limitations of traditional discrete and ordinary differential equation (ODE)-based models. We explore the well-posedness of the forward propagation problem, analyze the existence and properties of minimizers for the learning task, and provide a detailed examination of necessary and sufficient conditions for the existence of critical points. Controllability and optimality conditions for the learning task with its associated PDE forward problem are established using variational calculus, the Pontryagin Maximum Principle, and the Hamilton-Jacobi-Bellman equation, framing the deep learning process as a…
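As a rough illustration of what a continuum limit in both width and depth can look like, the sketch below discretizes an assumed integro-differential forward model, du/dt(x, t) = sigma(∫ w(x, y, t) u(y, t) dy + b(x, t)), where x indexes the (continuous) hidden nodes and t the (continuous) depth. This specific equation, the unit interval for x, and the names `forward`, `weight`, and `bias` are assumptions made for illustration and are not taken from the paper. A forward Euler step in t recovers the familiar residual-network update, and midpoint quadrature in x recovers a finite-width layer.

```python
import math

def forward(u0, weight, bias, depth_steps, T=1.0, sigma=math.tanh):
    """Forward Euler in depth t, midpoint quadrature in width x on [0, 1].

    Discretizes the ASSUMED model (not the paper's exact equation)
        du/dt(x, t) = sigma( integral_0^1 w(x, y, t) u(y, t) dy + b(x, t) ).
    u0 holds the input state sampled at the n midpoints of [0, 1].
    """
    n = len(u0)
    dx = 1.0 / n                  # width step: one cell per hidden node
    dt = T / depth_steps          # depth step: one Euler step per layer
    xs = [(i + 0.5) * dx for i in range(n)]
    u = list(u0)
    for k in range(depth_steps):
        t = k * dt
        new_u = []
        for i, x in enumerate(xs):
            # Quadrature of the nonlocal interaction among hidden nodes.
            integral = sum(weight(x, y, t) * u[j] for j, y in enumerate(xs)) * dx
            # Residual-style update: u_{k+1} = u_k + dt * sigma(W u_k + b).
            new_u.append(u[i] + dt * sigma(integral + bias(x, t)))
        u = new_u
    return u

# Hypothetical usage: a smooth kernel and zero bias on 8 nodes, 20 layers.
state = forward([1.0] * 8, lambda x, y, t: math.exp(-(x - y) ** 2),
                lambda x, t: 0.0, depth_steps=20)
```

Refining `depth_steps` and the number of nodes together is the discrete counterpart of letting depth and width go to infinity; the learning problem would then optimize over the functions `weight` and `bias`, which play the role of controls.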
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks
