Which Neural Net Architectures Give Rise To Exploding and Vanishing Gradients?

Boris Hanin

arXiv:1801.03744 · stat.ML · October 30, 2018 · 37 cites

TL;DR

This paper rigorously analyzes how the architecture of a randomly initialized fully connected ReLU network shapes the statistics of its gradients, showing that gradients vary wildly at initialization when the sum of the reciprocals of the hidden layer widths is large.

Contribution

It provides a rigorous statistical analysis of gradient behavior in randomly initialized fully connected ReLU networks, extending mean field theory with finite width corrections.

Findings

01

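The empirical variance of the squared entries of the input-output Jacobian is exponential in an architecture-dependent constant beta, the sum of the reciprocals of the hidden layer widths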

02

Large beta causes gradients to vary wildly at initialization

03

Finite width corrections are computed at the edge of chaos

Abstract

We give a rigorous analysis of the statistical behavior of gradients in a randomly initialized fully connected network N with ReLU activations. Our results show that the empirical variance of the squares of the entries in the input-output Jacobian of N is exponential in a simple architecture-dependent constant beta, given by the sum of the reciprocals of the hidden layer widths. When beta is large, the gradients computed by N at initialization vary wildly. Our approach complements the mean field theory analysis of random networks. From this point of view, we rigorously compute finite width corrections to the statistics of gradients at the edge of chaos.
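
As a rough illustration of the constant beta described in the abstract, the sketch below computes the sum of the reciprocals of the hidden layer widths for two architectures. The function name `beta` and both example width lists are hypothetical choices made for this illustration, not taken from the paper, and the code only evaluates the constant itself; it does not reproduce any of the paper's results.

```python
from fractions import Fraction


def beta(hidden_widths):
    """Return the sum of reciprocals of the hidden layer widths.

    `hidden_widths` is a list of positive integers, one per hidden layer.
    Exact rational arithmetic avoids floating-point rounding in the sum.
    """
    if not hidden_widths:
        raise ValueError("expected at least one hidden layer width")
    if any(n <= 0 for n in hidden_widths):
        raise ValueError("hidden layer widths must be positive")
    return sum(Fraction(1, n) for n in hidden_widths)


# Hypothetical architectures, chosen only to contrast the two regimes.
deep_narrow = [10] * 50    # 50 hidden layers of width 10
shallow_wide = [500] * 2   # 2 hidden layers of width 500

print(float(beta(deep_narrow)))   # 5.0
print(float(beta(shallow_wide)))  # 0.004
```

Under the abstract's statement, the deep and narrow network, with its much larger beta, is the one whose gradients would be expected to fluctuate more at initialization.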

Taxonomy

Topics: Neural Networks and Applications · Neural dynamics and brain function · Stochastic dynamics and bifurcation