Deep limits and cut-off phenomena for neural networks

Benny Avelin; Anders Karlsson

arXiv:2104.10727 · cs.LG · April 23, 2021 · 1 citation

TL;DR

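This paper studies dynamical and geometric aspects of deep neural networks. It shows that certain limits exist as the number of layers tends to infinity, and it reports a cut-off phenomenon in network depth under random initialization that may inform the choice of depth and of initialization procedure.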

Contribution

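It exhibits semi-invariant metrics for many standard layer maps and combines them with non-commutative ergodic theorems to show that certain limits exist as the number of layers grows. It also reports a surprising cut-off phenomenon in depth for randomly initialized standard networks.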

Findings

01. Existence of limits as the number of layers tends to infinity

02. Identification of a cut-off phenomenon in network depth

03. Implications for network initialization and architecture design

Abstract

We consider dynamical and geometrical aspects of deep learning. For many standard choices of layer maps we display semi-invariant metrics which quantify differences between data or decision functions. This allows us, when considering random layer maps and using non-commutative ergodic theorems, to deduce that certain limits exist when letting the number of layers tend to infinity. We also examine the random initialization of standard networks where we observe a surprising cut-off phenomenon in terms of the number of layers, the depth of the network. This could be a relevant parameter when choosing an appropriate number of layers for a given learning task, or for selecting a good initialization procedure. More generally, we hope that the notions and results in this paper can provide a framework, in particular a geometric one, for a part of the theoretical understanding of deep neural…
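As a rough illustration of what it means to study random initialization as a function of depth, the sketch below pushes two inputs through the same randomly initialized ReLU layers and records how far apart they are after each layer. Everything specific here is an assumption made for illustration and is not taken from the paper: the names `init_layers` and `angle_by_depth`, the He-style Gaussian initialization, the square bias-free layers, and the use of the angle between activations as a stand-in for the semi-invariant metrics the abstract mentions. It does not reproduce the authors' experiments or their cut-off result.

```python
import numpy as np


def init_layers(depth, width, rng, gain=2.0):
    """Draw `depth` square weight matrices with i.i.d. N(0, gain / width) entries.

    Assumption: He-style Gaussian initialization without biases,
    chosen for illustration only.
    """
    scale = np.sqrt(gain / width)
    return [rng.normal(0.0, scale, size=(width, width)) for _ in range(depth)]


def angle_by_depth(x, y, layers):
    """Propagate x and y through the same ReLU layers.

    Returns an array whose k-th entry is the angle (in radians) between
    the two activation vectors after layer k + 1. The angle is an
    illustrative choice of distance, not the metric used in the paper.
    """
    angles = []
    for W in layers:
        x = np.maximum(W @ x, 0.0)  # ReLU layer map
        y = np.maximum(W @ y, 0.0)
        nx, ny = np.linalg.norm(x), np.linalg.norm(y)
        if nx == 0.0 or ny == 0.0:
            # The angle is undefined if an activation vector has died.
            angles.append(np.nan)
            continue
        cos = np.clip((x @ y) / (nx * ny), -1.0, 1.0)
        angles.append(np.arccos(cos))
    return np.array(angles)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    width, depth = 256, 50
    layers = init_layers(depth, width, rng)
    x = rng.normal(size=width)
    y = rng.normal(size=width)
    for k, a in enumerate(angle_by_depth(x, y, layers), start=1):
        if k % 10 == 0:
            print(f"depth {k:3d}: angle = {a:.4f} rad")
```

Plotting such a per-layer quantity against depth, averaged over many random draws, is one simple way to look for a depth at which behaviour changes. Which quantity to track and which initialization to use are the choices the paper itself addresses.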

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Topological and Geometric Data Analysis