AutoInit: Automatic Initialization via Jacobian Tuning
Tianyu He, Darshil Doshi, Andrey Gromov

TL;DR
AutoInit is a novel, efficient algorithm that automatically finds effective initializations for deep neural networks by tuning Jacobian properties, improving training stability and performance across various architectures.
Contribution
We introduce a new Jacobian-based method for automatic initialization of DNNs, applicable to diverse architectures, reducing reliance on trial-and-error.
Findings
Automatic initialization improves training stability.
The method performs well on ResMLP and VGG architectures.
Convergence conditions are derived for fully connected networks.
Abstract
Good initialization is essential for training Deep Neural Networks (DNNs). Such an initialization is often found by trial and error, which has to be repeated every time an architecture is substantially modified, or it is inherited from smaller networks, which leads to sub-optimal initialization. In this work we introduce a new and cheap algorithm that finds a good initialization automatically for general feed-forward DNNs. The algorithm uses the Jacobian between adjacent network blocks to tune the network hyperparameters to criticality. We solve the dynamics of the algorithm for fully connected networks with ReLU and derive conditions for its convergence. We then extend the discussion to more general architectures with BatchNorm and residual connections. Finally, we apply our method to ResMLP and VGG architectures, where the automatic one-shot…
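Illustration
The idea of tuning a Jacobian to criticality can be sketched on a single fully connected ReLU block. This is a toy illustration and not the authors' implementation. It assumes that "criticality" means the averaged squared Jacobian norm J between adjacent blocks equals 1, and that the tuned hyperparameter is the weight scale sigma_w. The function names, the loss 0.5 * log(J)**2, the learning rate, and the step count are all hypothetical choices made for this example. For a ReLU block with weights drawn from N(0, sigma_w**2 / width), J is close to sigma_w**2 / 2, so the tuned value should land near the familiar He initialization, sigma_w**2 = 2.

```python
import math
import random


def jacobian_norm(sigma_w, width, rng):
    """Estimate J = ||d h_next / d h||_F^2 / width for h_next = W @ relu(h).

    W has i.i.d. entries from N(0, sigma_w**2 / width) and h is standard normal,
    so entry (i, j) of the Jacobian is W[i][j] when h[j] > 0 and 0 otherwise.
    """
    active = [rng.gauss(0.0, 1.0) > 0.0 for _ in range(width)]
    std = sigma_w / math.sqrt(width)
    total = 0.0
    for _ in range(width):  # one pass per row of W
        for is_active in active:
            w = rng.gauss(0.0, std)
            if is_active:
                total += w * w
    return total / width


def tune_to_criticality(sigma_w, width=128, steps=60, lr=0.05, seed=0):
    """Gradient descent on 0.5 * log(J)**2 with respect to log(sigma_w).

    The loss and the update rule are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    log_sigma = math.log(sigma_w)
    for _ in range(steps):
        j = jacobian_norm(math.exp(log_sigma), width, rng)
        # J is proportional to sigma_w**2, so d log(J) / d log(sigma_w) = 2.
        log_sigma -= lr * 2.0 * math.log(j)
    return math.exp(log_sigma)


if __name__ == "__main__":
    sigma = tune_to_criticality(sigma_w=0.5)
    print(f"tuned sigma_w^2 = {sigma ** 2:.3f} (He initialization uses 2)")
```

Working in log(sigma_w) keeps the scale positive and makes the update independent of the starting magnitude. Because J is re-estimated from fresh random samples at every step, the result fluctuates slightly around the critical value instead of matching it exactly.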
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Dense Connections · Convolution · Max Pooling · Softmax · Feedforward Network · Residual Connection · Affine Operator · Dropout · Residual Multi-Layer Perceptrons
