Almost Sure Asymptotic Freeness of Neural Network Jacobian with Orthogonal Weights

Tomohiro Hayase

arXiv:1908.03901·math.PR·February 13, 2020

TL;DR

This paper proves that, for wide neural networks whose weights are Haar-distributed orthogonal matrices, the layer-wise Jacobians are almost surely asymptotically free. This supports free-probability analysis of the Jacobian spectrum and, through it, of gradient behavior and training stability.

Contribution

It establishes almost sure asymptotic freeness of the layer-wise Jacobians of deep neural networks in the wide limit under Haar-distributed orthogonal weight initialization, giving a rigorous basis for free-probability computations of the Jacobian spectrum.
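For orientation, the objects involved can be written out as follows. This is a sketch in commonly used notation, not notation taken from this page: the width n, depth L, activation \varphi, pre-activations h^\ell, and the normalized trace tr are all assumed symbols, and the freeness condition is stated in a simplified form.

```latex
% Assumed notation: width n, depth L, activation \varphi,
% normalized trace \mathrm{tr} = n^{-1}\,\mathrm{Tr}.
% Forward pass and input-output Jacobian of a fully connected network:
\[
  h^{\ell} = W_{\ell}\, x^{\ell-1}, \qquad
  x^{\ell} = \varphi(h^{\ell}), \qquad \ell = 1, \dots, L,
\]
\[
  J = \frac{\partial x^{L}}{\partial x^{0}} = D_L W_L \cdots D_1 W_1, \qquad
  D_\ell = \mathrm{diag}\bigl(\varphi'(h^{\ell}_1), \dots, \varphi'(h^{\ell}_n)\bigr).
\]
% Simplified freeness condition for two families A_n and B_n of n x n matrices:
% whenever polynomials P_i, Q_i (in the matrices and their transposes) satisfy
% \lim_n \mathrm{tr}\, P_i(A_n) = 0 and \lim_n \mathrm{tr}\, Q_i(B_n) = 0, then
\[
  \lim_{n \to \infty}
  \mathrm{tr}\bigl( P_1(A_n)\, Q_1(B_n) \cdots P_k(A_n)\, Q_k(B_n) \bigr) = 0 .
\]
% "Almost sure" means the limit holds with probability one over the random weights.
```

In this reading, the role of A_n and B_n is played by the weight matrices W_\ell and the diagonal matrices D_\ell. The D_\ell depend on the weights through h^\ell, which is why freeness is not automatic.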

Findings

01 Jacobians become asymptotically free in the wide limit

02 Haar-distributed orthogonal weights lead to well-conditioned Jacobian spectra

03 Results help improve understanding of gradient stability in deep networks

Abstract

A well-conditioned Jacobian spectrum plays a vital role in preventing exploding or vanishing gradients and in speeding up the learning of deep neural networks. Free probability theory helps us understand and handle the Jacobian spectrum. We rigorously show almost sure asymptotic freeness of the layer-wise Jacobians of deep neural networks in the wide limit. In particular, we treat the case in which the weights are initialized as Haar-distributed orthogonal matrices.
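The setting in the abstract can be probed numerically. The following is a minimal sketch, not the paper's method: it samples Haar-distributed orthogonal weights with a QR decomposition, builds the Jacobian of a small tanh network, and compares the normalized trace of J J^T with the product of the per-layer second moments of the activation derivatives, a first-moment identity one would expect under asymptotic freeness. The function names, the tanh activation, the Gaussian input, and the width and depth are all assumptions chosen for illustration.

```python
import numpy as np


def haar_orthogonal(n, rng):
    """Sample an n x n orthogonal matrix from the Haar measure."""
    z = rng.standard_normal((n, n))
    q, r = np.linalg.qr(z)
    # Rescale each column of Q by the sign of the matching diagonal entry of R;
    # without this correction the QR output is not Haar distributed.
    return q * np.sign(np.diag(r))


def jacobian_and_diagonals(n, depth, rng):
    """Return J = D_L W_L ... D_1 W_1 and the values mean(phi'(h)^2) per layer.

    Assumptions for illustration: phi = tanh, Gaussian input of norm about sqrt(n).
    """
    x = rng.standard_normal(n)
    jac = np.eye(n)
    diag_sq_means = []
    for _ in range(depth):
        w = haar_orthogonal(n, rng)
        h = w @ x
        d = 1.0 - np.tanh(h) ** 2      # phi'(h) for phi = tanh
        jac = (d[:, None] * w) @ jac   # left-multiply by D_l W_l
        diag_sq_means.append(float(np.mean(d ** 2)))
        x = np.tanh(h)
    return jac, diag_sq_means


rng = np.random.default_rng(0)
n, depth = 256, 3
jac, diag_sq_means = jacobian_and_diagonals(n, depth, rng)

# Normalized trace of J J^T versus the product of the per-layer tr(D_l^2).
empirical = float(np.trace(jac @ jac.T) / n)
predicted = float(np.prod(diag_sq_means))
print(f"tr(J J^T) = {empirical:.4f}, product of tr(D_l^2) = {predicted:.4f}")
```

The two printed numbers should be close at this width and closer as n grows. Agreement of this single moment is only a sanity check; it does not by itself establish freeness.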

Taxonomy

Topics: Model Reduction and Neural Networks · Markov Chains and Monte Carlo Methods · Stochastic Gradient Optimization Techniques