Asymptotic Analysis of Deep Residual Networks

Rama Cont, Alain Rossier, and Renyuan Xu

arXiv:2212.08199·cs.LG·January 26, 2023

TL;DR

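This paper analyzes the asymptotic behavior of deep residual networks (ResNets) as the number of layers grows large. It identifies distinct scaling regimes for the trained weights and shows that, depending on the regime, the hidden state dynamics converge to an ODE, an SDE, or neither.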

Contribution

It identifies scaling regimes for trained ResNet weights that differ markedly from those implicitly assumed in the neural ODE literature, and characterizes the limiting hidden state dynamics in each regime, including a diffusive regime described by stochastic differential equations.

Findings

01. Existence of multiple scaling regimes for trained weights.

02. Convergence of hidden state dynamics to ODEs or SDEs depending on regime.

03. Derivation of scaling limits for backpropagation dynamics.

Abstract

We investigate the asymptotic properties of deep Residual networks (ResNets) as the number of layers increases. We first show the existence of scaling regimes for trained weights markedly different from those implicitly assumed in the neural ODE literature. We study the convergence of the hidden state dynamics in these scaling regimes, showing that one may obtain an ODE, a stochastic differential equation (SDE) or neither of these. In particular, our findings point to the existence of a diffusive regime in which the deep network limit is described by a class of stochastic differential equations (SDEs). Finally, we derive the corresponding scaling limits for the backpropagation dynamics.
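
To make the contrast between scaling regimes concrete, here is a minimal numerical sketch in Python. It is not the paper's parametrization: the residual update `h + tanh(W @ h)`, the two weight constructions, and all function names are assumptions chosen for illustration. The "smooth" weights vary slowly with depth and shrink like 1/L (the setting usually associated with neural ODEs), while the "diffusive" weights are i.i.d. Gaussian with entries of standard deviation 1/sqrt(L·d). The summed squared increments of the hidden state (a discrete quadratic variation) should shrink with depth in the first case and stay of order one in the second, which is the signature one would expect of an SDE-type limit.

```python
import numpy as np

def resnet_forward(h0, weights, activation=np.tanh):
    """Apply h_{k+1} = h_k + activation(W_k @ h_k) layer by layer.

    Returns an array of shape (L + 1, d) holding every hidden state.
    The update rule is an illustrative assumption, not the paper's exact model.
    """
    states = [np.asarray(h0, dtype=float)]
    for W in weights:
        h = states[-1]
        states.append(h + activation(W @ h))
    return np.stack(states)

def smooth_weights(L, d, rng):
    """Assumed ODE-like regime: W_k = A(k/L) / L with A(t) linear in t."""
    A0 = rng.standard_normal((d, d))
    A1 = rng.standard_normal((d, d))
    return [((1.0 - k / L) * A0 + (k / L) * A1) / L for k in range(L)]

def diffusive_weights(L, d, rng):
    """Assumed diffusive regime: i.i.d. Gaussian entries with std 1/sqrt(L*d)."""
    return [rng.standard_normal((d, d)) / np.sqrt(L * d) for _ in range(L)]

def quadratic_variation(states):
    """Sum over layers of the squared norm of the hidden state increments."""
    increments = np.diff(states, axis=0)
    return float(np.sum(increments ** 2))

if __name__ == "__main__":
    d = 4
    h0 = np.ones(d)
    for L in (100, 1000, 10000):
        qv_smooth = quadratic_variation(
            resnet_forward(h0, smooth_weights(L, d, np.random.default_rng(0))))
        qv_diff = quadratic_variation(
            resnet_forward(h0, diffusive_weights(L, d, np.random.default_rng(0))))
        print(f"L={L:6d}  smooth QV={qv_smooth:.4e}  diffusive QV={qv_diff:.4e}")
```

Running the script prints the quadratic variation for increasing depth L. The exact numbers depend on the seed and on the assumed weight constructions, so they illustrate the qualitative distinction only and say nothing about the regimes observed for trained weights in the paper.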

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Applications