AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks
Garrett Bingham, Risto Miikkulainen

TL;DR
AutoInit is a weight initialization method that adapts analytically to different neural network architectures, improving training stability and performance across diverse models and settings without relying on data samples.
Contribution
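The paper introduces AutoInit, an automatic, analytically derived weight initialization scheme that tracks the mean and variance of signals through the network and scales each layer's weights accordingly, so that it adapts to different architectures instead of assuming a fixed activation function or topology.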
Findings
Improves performance of convolutional, residual, and transformer networks.
More reliable than data-dependent methods across various settings.
Effective from small tasks to large datasets like ImageNet.
Abstract
Neural networks require careful weight initialization to prevent signals from exploding or vanishing. Existing initialization schemes solve this problem in specific cases by assuming that the network has a certain activation function or topology. It is difficult to derive such weight initialization strategies, and modern architectures therefore often use these same initialization schemes even though their assumptions do not hold. This paper introduces AutoInit, a weight initialization algorithm that automatically adapts to different neural network architectures. By analytically tracking the mean and variance of signals as they propagate through the network, AutoInit appropriately scales the weights at each layer to avoid exploding or vanishing signals. Experiments demonstrate that AutoInit improves performance of convolutional, residual, and transformer networks across a range of…
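The idea of tracking signal mean and variance layer by layer can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only a plain stack of dense layers followed by ReLU, and it assumes zero-mean Gaussian weights, independent inputs, Gaussian pre-activations, and network inputs with mean 0 and variance 1. The function names `relu_moments` and `init_dense_relu_stack` are hypothetical and chosen for illustration.

```python
import math
import numpy as np


def relu_moments(mu, nu):
    """Mean and variance of ReLU(x) for x ~ N(mu, nu).

    Assumes the incoming signal is Gaussian (an illustrative assumption).
    """
    s = math.sqrt(nu)
    z = mu / s
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    mean = mu * cdf + s * pdf
    second_moment = (mu * mu + nu) * cdf + mu * s * pdf
    return mean, second_moment - mean * mean


def init_dense_relu_stack(layer_sizes, seed=0):
    """Sample weights for dense+ReLU layers so pre-activations have variance ~1.

    Hypothetical helper: tracks (mu, nu), the analytic mean and variance of the
    signal entering each dense layer, and picks the weight scale from them.
    """
    rng = np.random.default_rng(seed)
    mu, nu = 0.0, 1.0  # assumed statistics of the network input
    weights = []
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        # With zero-mean weights of std sigma, the dense output has mean 0 and
        # variance fan_in * sigma^2 * (nu + mu^2). Solve for variance 1.
        sigma = 1.0 / math.sqrt(fan_in * (nu + mu * mu))
        weights.append(rng.normal(0.0, sigma, size=(fan_in, fan_out)))
        # The dense output is ~N(0, 1); propagate it through ReLU analytically.
        mu, nu = relu_moments(0.0, 1.0)
    return weights


if __name__ == "__main__":
    ws = init_dense_relu_stack([256, 512, 512, 512])
    h = np.random.default_rng(1).normal(size=(2000, 256))
    for i, w in enumerate(ws):
        pre = h @ w
        print(f"layer {i}: pre-activation variance = {pre.var():.3f}")
        h = np.maximum(pre, 0.0)
```

Under these assumptions, the signal after a ReLU has second moment 1/2, so the rule reduces to a weight standard deviation of sqrt(2 / fan_in) for every layer that follows a ReLU, which matches the familiar He scaling. Other layer types (pooling, normalization, residual connections) would each need their own mean and variance propagation rule.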
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning in Materials Science · Adversarial Robustness in Machine Learning
