Asymptotic Analysis of Deep Residual Networks
Rama Cont, Alain Rossier, and Renyuan Xu

TL;DR
This paper analyzes the asymptotic behavior of deep residual networks (ResNets) as the number of layers grows, identifying distinct scaling regimes for the trained weights whose limits include ODEs and SDEs.
Contribution
It identifies scaling regimes for trained ResNet weights that differ from those implicitly assumed in the neural ODE literature, and characterizes the limiting hidden-state dynamics in each regime, which may be an ODE, a stochastic differential equation (SDE), or neither.
Findings
Existence of multiple scaling regimes for trained weights.
Convergence of hidden state dynamics to ODEs or SDEs depending on regime.
Derivation of scaling limits for backpropagation dynamics.
Abstract
We investigate the asymptotic properties of deep Residual networks (ResNets) as the number of layers increases. We first show the existence of scaling regimes for trained weights markedly different from those implicitly assumed in the neural ODE literature. We study the convergence of the hidden state dynamics in these scaling regimes, showing that one may obtain an ODE, a stochastic differential equation (SDE) or neither of these. In particular, our findings point to the existence of a diffusive regime in which the deep network limit is described by a class of stochastic differential equations (SDEs). Finally, we derive the corresponding scaling limits for the backpropagation dynamics.
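The contrast between an ODE limit and a diffusive limit can be illustrated with a toy simulation. The sketch below is not the paper's construction: it assumes a scalar hidden state, a tanh activation, the residual update h_{k+1} = h_k + tanh(a_k h_k), and two hypothetical weight choices (smooth weights a_k = a(k/L)/L with a(t) = t, and i.i.d. Gaussian weights with variance sigma^2/L). All function names are illustrative.

```python
import numpy as np


def resnet_forward(h0, weights):
    """Apply the toy residual update h_{k+1} = h_k + tanh(a_k * h_k).

    `weights` has shape (L,) for a single network, or (L, n) to run
    n independent networks in parallel (one column per network).
    """
    h = np.asarray(h0, dtype=float)
    for a_k in weights:
        h = h + np.tanh(a_k * h)
    return h


def smooth_weights(L, a_fn=lambda t: t):
    """Assumed neural-ODE-style scaling: a_k = a(k/L) / L with a smooth a."""
    t = np.arange(L) / L
    return a_fn(t) / L


def diffusive_weights(L, sigma, n_samples, rng):
    """Assumed diffusive scaling: i.i.d. N(0, sigma^2 / L) weights."""
    return rng.normal(0.0, sigma / np.sqrt(L), size=(L, n_samples))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for L in (100, 1000, 10000):
        # Smooth regime: the output settles near the ODE solution
        # h(1) = h0 * exp(integral of a(t) over [0, 1]) = exp(0.5).
        h_ode = resnet_forward(1.0, smooth_weights(L))
        # Diffusive regime: the output stays random as L grows.
        h_sde = resnet_forward(np.ones(2000), diffusive_weights(L, 1.0, 2000, rng))
        print(L, float(h_ode), float(h_sde.mean()), float(h_sde.std()))
```

In this toy setting, the smooth-weight output approaches a deterministic value as L grows, whereas the spread of outputs under the Gaussian weights does not shrink, which is the qualitative signature of a stochastic limit. The scaling exponents observed for actual trained weights are an empirical question that the paper addresses and this sketch does not.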
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Applications
