Convergence of Deep ReLU Networks
Yuesheng Xu, Haizhang Zhang

TL;DR
This paper investigates the mathematical convergence properties of deep ReLU neural networks as their depth increases, providing conditions for convergence and insights into residual network design.
Contribution
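It introduces activation domains and activation matrices, recasts network convergence as convergence of a class of infinite products of matrices, studies sufficient and necessary conditions for those products to converge, and derives necessary conditions for deep ReLU networks to converge.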
Findings
A necessary condition for convergence is that the weight matrices converge to the identity matrix and the bias vectors converge to zero as the depth tends to infinity.
Pointwise convergence is guaranteed under specific conditions on weights and biases.
Results inform the design of deep residual networks in image classification.
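As a purely numerical illustration of the first two findings, the sketch below stacks ReLU layers whose weights approach the identity and whose biases approach zero, then tracks how much each added layer changes the output. The 1/n^2 decay rate, the scale factor, the input vector, and the name layer_outputs are assumptions chosen for this example; they are not taken from the paper, and the run does not verify any of its theorems.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def layer_outputs(x, depth, dim, scale=0.1, seed=0):
    """Apply `depth` ReLU layers and return the output after each one.

    Layer n uses W_n = I + scale * G_n / n**2 and b_n = scale * g_n / n**2,
    with G_n and g_n Gaussian. The 1/n**2 decay is an assumption made for
    this sketch so that W_n -> I and b_n -> 0 as n grows.
    """
    rng = np.random.default_rng(seed)
    outputs = []
    for n in range(1, depth + 1):
        W = np.eye(dim) + scale * rng.normal(size=(dim, dim)) / n**2
        b = scale * rng.normal(size=dim) / n**2
        x = relu(W @ x + b)
        outputs.append(x)
    return outputs

outs = layer_outputs(np.array([1.0, -0.5, 2.0, 0.3]), depth=2000, dim=4)

# Change in the output caused by each additional layer.
step_sizes = [np.linalg.norm(a - b) for a, b in zip(outs[1:], outs[:-1])]
print("change after layer 2:   ", step_sizes[0])
print("change after layer 2000:", step_sizes[-1])
```

In this setup the per-layer change shrinks as depth grows, which is the behaviour one would expect when the perturbations from the identity are summable. Other decay rates may behave differently.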
Abstract
We explore convergence of deep neural networks with the popular ReLU activation function, as the depth of the networks tends to infinity. To this end, we introduce the notion of activation domains and activation matrices of a ReLU network. By replacing applications of the ReLU activation function by multiplications with activation matrices on activation domains, we obtain an explicit expression of the ReLU network. We then identify the convergence of the ReLU networks as convergence of a class of infinite products of matrices. Sufficient and necessary conditions for convergence of these infinite products of matrices are studied. As a result, we establish necessary conditions for ReLU networks to converge that the sequence of weight matrices converges to the identity matrix and the sequence of the bias vectors converges to zero as the depth of ReLU networks increases to infinity.…
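The idea of replacing each ReLU application with a matrix multiplication can be sketched in a few lines. The code below is a minimal illustration, assuming an activation matrix is the diagonal 0/1 matrix that agrees with ReLU on the given pre-activation vector; the function names, dimensions, and random weights are hypothetical and not taken from the paper.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def activation_matrix(z):
    """Diagonal 0/1 matrix D such that relu(z) == D @ z for this z."""
    return np.diag((z > 0).astype(float))

def forward_relu(weights, biases, x):
    """Standard forward pass: x <- relu(W x + b) at every layer."""
    for W, b in zip(weights, biases):
        x = relu(W @ x + b)
    return x

def forward_activation_matrices(weights, biases, x):
    """Same pass with each ReLU replaced by multiplication with D."""
    for W, b in zip(weights, biases):
        z = W @ x + b
        x = activation_matrix(z) @ z
    return x

rng = np.random.default_rng(0)
dim, depth = 4, 6
weights = [rng.normal(size=(dim, dim)) for _ in range(depth)]
biases = [rng.normal(size=dim) for _ in range(depth)]
x0 = rng.normal(size=dim)

out_relu = forward_relu(weights, biases, x0)
out_mats = forward_activation_matrices(weights, biases, x0)
same = np.allclose(out_relu, out_mats)
print("outputs agree:", same)
```

Once the activation matrices are fixed for a given input, the layer stack is an affine map built from products of the form D W, which is why the convergence question can be restated in terms of infinite products of matrices.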
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Numerical methods in inverse problems · Medical Image Segmentation Techniques
