On the Initialisation of Wide Low-Rank Feedforward Neural Networks
Thiziri Nait Saada, Jared Tanner

TL;DR
This paper analyzes the dynamics of wide low-rank feedforward neural networks at initialization. It gives formulas for the optimal weight and bias variances and shows that the variance of the input-output Jacobian grows as the rank-to-width ratio decreases, which informs how such networks should be initialized.
Contribution
It extends optimal initialization formulas from full-rank to low-rank networks and elucidates how rank influences network dynamics and initialization strategies.
Findings
Optimal weight and bias variances derived for low-rank networks
Jacobian variance increases as rank-to-width ratio decreases
Guidelines for initializing low-rank networks efficiently
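To make the idea of a low-rank initialization concrete, here is a minimal NumPy sketch. It assumes, purely for illustration, that each n × n weight matrix is factored as W = U V with U of size n × r and V of size r × n, and that the factor variances are chosen so the pre-activation variance matches a full-rank Gaussian layer whose entries have variance sigma_w^2 / n. The factorization, the function names, and the way the variance is split between the two factors are assumptions of this sketch, not the paper's exact formulae.

```python
import numpy as np

def init_low_rank_layer(n, r, sigma_w=1.0, sigma_b=0.0, rng=None):
    """Draw a rank-r Gaussian layer W = U @ V of width n (hypothetical helper).

    Assumed scaling: Var(U_ik) = sigma_w^2 / r and Var(V_kj) = 1 / n, so that
    Var((W x)_i) = sigma_w^2 * ||x||^2 / n, the same as for a full-rank layer
    with entries N(0, sigma_w^2 / n).
    """
    rng = np.random.default_rng() if rng is None else rng
    U = rng.normal(0.0, sigma_w / np.sqrt(r), size=(n, r))
    V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(r, n))
    b = rng.normal(0.0, sigma_b, size=n)
    return U, V, b

def preactivation(U, V, b, x):
    """Compute U (V x) + b without forming the n x n product U @ V."""
    return U @ (V @ x) + b

# Compare the empirical second moment of the pre-activations with the
# value expected under the assumed scaling.
rng = np.random.default_rng(0)
n, r, sigma_w, sigma_b = 1024, 256, 1.5, 0.5
x = rng.normal(size=n)
U, V, b = init_low_rank_layer(n, r, sigma_w, sigma_b, rng)
h = preactivation(U, V, b, x)
print("empirical:", np.mean(h ** 2))
print("expected :", sigma_w ** 2 * np.mean(x ** 2) + sigma_b ** 2)
```

Storing U and V takes 2nr numbers instead of n^2, and applying the layer costs two thin matrix-vector products, which is where the savings in memory and computation come from when r is much smaller than n.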
Abstract
The edge-of-chaos dynamics of wide randomly initialized low-rank feedforward networks are analyzed. Formulae for the optimal weight and bias variances are extended from the full-rank to the low-rank setting and are shown to follow from multiplicative scaling. The principal second-order effect, the variance of the input-output Jacobian, is derived and shown to increase as the rank-to-width ratio decreases. These results inform practitioners how to randomly initialize feedforward networks with a reduced number of learnable parameters while remaining in the same ambient dimension, allowing reductions in the computational cost and memory requirements of the associated network.
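The dependence of fluctuations on the rank-to-width ratio can be seen in a small numerical experiment. The sketch below is a simplified proxy, not the paper's derivation: it uses a single linear layer (whose Jacobian is just the weight matrix), assumes a factorization W = U V with Gaussian factors scaled so that the mean squared gain is 1, and measures how the squared gain ||W x||^2 for a unit-norm input varies across random draws. The function name and the choice of scaling are assumptions of this sketch.

```python
import numpy as np

def gain_samples(n, r, trials=2000, seed=0):
    """Sample ||U V x||^2 for a fixed unit-norm x over random rank-r layers.

    Assumed scaling: Var(U_ik) = 1 / r and Var(V_kj) = 1 / n, which makes the
    expected squared gain equal to 1 for every rank r.
    """
    rng = np.random.default_rng(seed)
    x = np.ones(n) / np.sqrt(n)  # unit-norm input direction
    gains = np.empty(trials)
    for t in range(trials):
        U = rng.normal(0.0, 1.0 / np.sqrt(r), size=(n, r))
        V = rng.normal(0.0, 1.0 / np.sqrt(n), size=(r, n))
        gains[t] = np.sum((U @ (V @ x)) ** 2)
    return gains

# The mean gain stays near 1, while its spread across draws grows as r / n shrinks.
n = 64
for r in (64, 16, 4):
    g = gain_samples(n, r)
    print(f"r/n = {r / n:.4f}  mean gain = {g.mean():.3f}  variance = {g.var():.3f}")
```

Under these assumptions the squared gain is a product of two independent scaled chi-squared variables, so its variance is (1 + 2/r)(1 + 2/n) - 1, roughly 2/r + 2/n: the mean is fixed by the scaling, but the spread grows as the rank drops.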
Taxonomy
Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM
