Genericity of Polyak-Lojasiewicz Inequalities for Entropic Mean-Field Neural ODEs
Samuel Daudin, François Delarue

TL;DR
This paper demonstrates that in continuum-layer ResNets with entropic regularization, a local Polyak-Lojasiewicz inequality holds generically, ensuring exponential convergence of gradient descent near stable global minimizers.
Contribution
It proves the genericity of the Polyak-Lojasiewicz inequality for entropic mean-field ResNets, leading to convergence guarantees in a continuum-layer setting.
Findings
For generic initial data, the control problem admits a unique stable global minimizer.
Stable minimizers satisfy a local Polyak-Lojasiewicz inequality.
Gradient descent converges exponentially near these minimizers.
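To make the link between a Polyak-Lojasiewicz (PL) inequality and exponential convergence concrete, here is a minimal finite-dimensional sketch. It is not the paper's mean-field setting: the one-dimensional test function f(x) = x^2 + 3 sin^2(x), the names `estimate_pl_constant` and `gradient_descent`, and the starting point are all illustrative assumptions. The function is nonconvex, has minimum value 0 at x = 0, and its second derivative is bounded by 8 in absolute value. The code estimates a PL constant mu numerically on a grid, that is, a constant with (1/2)|f'(x)|^2 >= mu (f(x) - min f), and then runs gradient descent with step 1/L so that the values can be compared against the standard geometric bound (1 - mu/L)^k (f(x_0) - min f).

```python
import math


def f(x):
    """Nonconvex test function with global minimum f(0) = 0."""
    return x * x + 3.0 * math.sin(x) ** 2


def grad_f(x):
    """Derivative of f: 2x + 6 sin(x) cos(x) = 2x + 3 sin(2x)."""
    return 2.0 * x + 3.0 * math.sin(2.0 * x)


# Smoothness constant: |f''(x)| = |2 + 6 cos(2x)| <= 8 for all x.
L_SMOOTH = 8.0


def estimate_pl_constant(radius=4.0, n=8001):
    """Grid estimate of the largest mu with 0.5 * f'(x)^2 >= mu * f(x) on [-radius, radius].

    This is a numerical estimate on a finite grid, not a proof.
    """
    ratios = []
    for i in range(n):
        x = -radius + 2.0 * radius * i / (n - 1)
        if abs(x) < 1e-9:
            continue  # skip the minimizer, where the ratio is 0/0
        ratios.append(0.5 * grad_f(x) ** 2 / f(x))
    return min(ratios)


def gradient_descent(x0, n_steps=60):
    """Run gradient descent with step 1/L and return the objective values."""
    x = x0
    values = [f(x)]
    for _ in range(n_steps):
        x -= grad_f(x) / L_SMOOTH
        values.append(f(x))
    return values


if __name__ == "__main__":
    # Shrink the grid estimate slightly as a safety margin (an assumption).
    mu = 0.9 * estimate_pl_constant()
    rate = 1.0 - mu / L_SMOOTH
    values = gradient_descent(3.0)
    for k in range(0, len(values), 10):
        print(k, values[k], rate ** k * values[0])
```

Each printed row shows the iteration index, the objective value, and the geometric bound. In this toy case the PL inequality holds on the whole sublevel set of the starting point; in the paper's setting the inequality is local, so the analogous guarantee applies only near a stable minimizer.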
Abstract
We address the behavior of idealized deep residual neural networks (ResNets), modeled via an optimal control problem set over continuity (or adjoint transport) equations. The continuity equations describe the statistical evolution of the features in the asymptotic regime where the layers of the network form a continuum. The velocity field is expressed through the network activation function, which is itself viewed as a function of the statistical distribution of the network parameters (weights and biases). From a mathematical standpoint, the control is interpreted in a relaxed sense, taking values in the space of probability measures over the set of parameters. We investigate the optimal behavior of the network when the cost functional arises from a regression problem and includes an additional entropic regularization term on the distribution of the parameters. In this framework, we…
