Convergence Analysis of Deep Residual Networks

Wentao Huang; Haizhang Zhang

arXiv:2205.06571 · cs.LG · May 16, 2022

TL;DR

This paper provides a mathematical analysis of the convergence behavior of deep Residual Networks as their depth increases, offering insights into their design and theoretical foundations.

Contribution

It introduces a matrix-vector framework for analyzing ResNets and establishes sufficient conditions for their convergence as depth tends to infinity.
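
The idea of convergence as depth tends to infinity can be illustrated with a small numerical sketch. It is not the paper's construction: the block form x <- x + W2 relu(W1 x + b), the unit norms on W1 and b, and the 1/n^2 scaling of W2 are all assumptions chosen here so that the residual-branch norms are summable. The helper names `unit_spectral`, `resnet_output`, and `depth_gap` are hypothetical.

```python
import numpy as np

def unit_spectral(rng, width):
    """Random square matrix rescaled to spectral norm 1."""
    g = rng.standard_normal((width, width))
    return g / np.linalg.norm(g, 2)

def resnet_output(x, depth, decay=2.0, seed=0):
    """Apply `depth` residual blocks x <- x + W2 @ relu(W1 @ x + b).

    Assumption for illustration only: ||W1|| = ||b|| = 1 and
    ||W2|| = n**(-decay), so the branch norms are summable when decay > 1.
    """
    # A fixed seed draws the same blocks in the same order, so a shallower
    # network is exactly the first `depth` blocks of a deeper one.
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    width = x.shape[0]
    for n in range(1, depth + 1):
        w1 = unit_spectral(rng, width)
        b = rng.standard_normal(width)
        b /= np.linalg.norm(b)
        w2 = unit_spectral(rng, width) / n**decay
        x = x + w2 @ np.maximum(w1 @ x + b, 0.0)  # shortcut plus residual branch
    return x

def depth_gap(x, shallow, deep, **kwargs):
    """Euclidean distance between the outputs at two depths."""
    diff = resnet_output(x, deep, **kwargs) - resnet_output(x, shallow, **kwargs)
    return float(np.linalg.norm(diff))

x0 = np.array([0.5, -0.5, 0.5, -0.5])  # unit-norm input
for shallow in (50, 200, 800):
    print(shallow, 2 * shallow, depth_gap(x0, shallow, 2 * shallow))
```

Under these assumptions each block changes the output by at most (||x|| + 1) / n^2, so the printed gaps are bounded by a constant times the tail of the series of 1/n^2 and shrink as the depth grows. With `decay` at or below 1 this argument no longer applies, and the sketch gives no guarantee of convergence.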

Findings

01. Established a convergence criterion for deep ResNets
02. Provided a mathematical justification for ResNet design
03. Verified theoretical results with experiments on benchmark data

Abstract

Various powerful deep neural network architectures have contributed greatly to the exciting successes of deep learning in the past two decades. Among them, deep Residual Networks (ResNets) are of particular importance because they demonstrated great usefulness in computer vision by winning first place in many deep learning competitions. ResNets were also the first class of neural networks in the history of deep learning that are truly deep. It is of both mathematical interest and practical significance to understand the convergence of deep ResNets. We aim to characterize the convergence of deep ResNets, as the depth tends to infinity, in terms of the parameters of the networks. To this end, we first give a matrix-vector description of general deep neural networks with shortcut connections and formulate an explicit expression for the networks by using the notions of…

Taxonomy

Topics: Neural Networks and Applications · Matrix Theory and Algorithms · Model Reduction and Neural Networks