On the Initialisation of Wide Low-Rank Feedforward Neural Networks

Thiziri Nait Saada; Jared Tanner

arXiv:2301.13710·stat.ML·February 1, 2023

TL;DR

This paper analyzes the edge-of-chaos dynamics of wide, randomly initialized low-rank feedforward networks. It extends the formulas for the optimal weight and bias variances from the full-rank to the low-rank setting and shows that the variance of the input-output Jacobian grows as the rank-to-width ratio decreases.

Contribution

It extends optimal initialization formulas from full-rank to low-rank networks and elucidates how rank influences network dynamics and initialization strategies.

Findings

01 Optimal weight and bias variances derived for low-rank networks

02 Jacobian variance increases as rank-to-width ratio decreases

03 Guidelines for initializing low-rank networks efficiently
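
As a rough illustration of what a variance-matched low-rank initialisation can look like, the sketch below samples a weight matrix as a product of two Gaussian factors and rescales them so that each entry keeps the full-rank variance sigma_w^2 / width. The factorisation W = U V^T, the even split of the variance between the two factors, and all function names are assumptions made for this example; the paper's own construction and constants may differ.

```python
import numpy as np

def low_rank_gaussian_weight(width, rank, sigma_w, rng):
    """Sample W = U @ V.T (rank <= `rank`) with Var(W_ij) = sigma_w**2 / width.

    Assumed construction for illustration, not taken from the paper.
    """
    # W_ij is a sum of `rank` independent products U_ik * V_jk, so
    # Var(W_ij) = rank * Var(U) * Var(V). Splitting the target variance
    # evenly between the factors gives the per-factor std below.
    factor_std = (sigma_w**2 / (width * rank)) ** 0.25
    U = rng.normal(0.0, factor_std, size=(width, rank))
    V = rng.normal(0.0, factor_std, size=(width, rank))
    return U @ V.T

def mean_preactivation_variance(width, rank, sigma_w, sigma_b, trials=20, seed=0):
    """Average second moment of h = W x + b over units and random draws."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(trials):
        x = rng.normal(size=width)
        x *= np.sqrt(width) / np.linalg.norm(x)  # normalise so |x|^2 / width = 1
        W = low_rank_gaussian_weight(width, rank, sigma_w, rng)
        b = rng.normal(0.0, sigma_b, size=width)
        h = W @ x + b
        estimates.append(np.mean(h**2))
    return float(np.mean(estimates))

if __name__ == "__main__":
    width, sigma_w, sigma_b = 512, 1.5, 0.5
    print("full-rank target:", sigma_w**2 + sigma_b**2)
    for gamma in (1.0, 0.5, 0.25):
        rank = int(gamma * width)
        est = mean_preactivation_variance(width, rank, sigma_w, sigma_b)
        print(f"gamma={gamma}: estimated pre-activation variance {est:.3f}")
```

Under these assumptions the averaged pre-activation variance stays near sigma_w^2 + sigma_b^2 for every rank, although individual draws fluctuate more as the rank shrinks, which is why the estimate is averaged over several trials.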

Abstract

The edge-of-chaos dynamics of wide randomly initialized low-rank feedforward networks are analyzed. Formulae for the optimal weight and bias variances are extended from the full-rank to the low-rank setting and are shown to follow from multiplicative scaling. The principal second-order effect, the variance of the input-output Jacobian, is derived and shown to increase as the rank-to-width ratio decreases. These results inform practitioners how to randomly initialize feedforward networks with a reduced number of learnable parameters while keeping the same ambient dimension, allowing reductions in the computational cost and memory requirements of the associated network.
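
The dependence of the Jacobian on the rank-to-width ratio can be probed numerically. The sketch below is a simplified stand-in, not the paper's derivation: it assumes a deep linear network (identity activation, so the input-output Jacobian is just the product of the weight matrices), assumes each weight is a product of two Gaussian factors, and uses the variance of the mean-normalised squared singular values of the Jacobian as a proxy for the "Jacobian variance". The function name and all parameters are hypothetical.

```python
import numpy as np

def jacobian_spectrum_spread(width, gamma, depth, trials=6, seed=0):
    """Variance of the normalised squared singular values of a deep linear
    network's Jacobian, averaged over `trials` random initialisations.

    Assumptions (for illustration only): identity activation, no biases,
    and low-rank weights W = U @ V.T with Var(W_ij) = 1 / width.
    """
    rng = np.random.default_rng(seed)
    rank = max(1, int(round(gamma * width)))
    factor_std = (1.0 / (width * rank)) ** 0.25
    spreads = []
    for _ in range(trials):
        J = np.eye(width)
        for _ in range(depth):
            U = rng.normal(0.0, factor_std, size=(width, rank))
            V = rng.normal(0.0, factor_std, size=(width, rank))
            J = (U @ V.T) @ J  # Jacobian of a linear network is the weight product
        sq = np.linalg.svd(J, compute_uv=False) ** 2
        sq /= sq.mean()        # normalise so the mean squared singular value is 1
        spreads.append(sq.var())
    return float(np.mean(spreads))

if __name__ == "__main__":
    for gamma in (1.0, 0.5, 0.25):
        spread = jacobian_spectrum_spread(width=128, gamma=gamma, depth=2)
        print(f"gamma={gamma}: spread of squared singular values {spread:.2f}")
```

In this toy setting the spread grows as gamma shrinks, in line with the qualitative trend stated in the abstract. Part of the growth is structural: a rank-limited Jacobian has at most gamma * width nonzero singular values, so fixing their mean forces the remaining ones to be larger.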

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM