LION-DG: Layer-Informed Initialization with Deep Gradient Protocols for Accelerated Neural Network Training
Hyunjun Kim

TL;DR
LION-DG introduces a layer-informed initialization method for deep neural networks that stabilizes early training, accelerates convergence, and requires no hyperparameters, demonstrated on CIFAR datasets with DenseNet-DS and ResNet-DS.
Contribution
The paper proposes a layer-informed initialization scheme that implements Gradient Awakening, improving training speed and stability without additional hyperparameters or computational cost.
Findings
DenseNet-DS: 8.3% faster convergence on CIFAR-10 with comparable accuracy
Hybrid LSUV + LION-DG achieves 81.92% accuracy on CIFAR-10
ResNet-DS: +11.3% speedup on CIFAR-100
Abstract
Weight initialization remains decisive for neural network optimization, yet existing methods are largely layer-agnostic. We study initialization for deeply-supervised architectures with auxiliary classifiers, where untrained auxiliary heads can destabilize early training through gradient interference. We propose LION-DG, a layer-informed initialization that zero-initializes auxiliary classifier heads while applying standard He initialization to the backbone. We prove that this implements Gradient Awakening: auxiliary gradients are exactly zero at initialization, then phase in naturally as weights grow, providing an implicit warmup without hyperparameters. Experiments on CIFAR-10 and CIFAR-100 with DenseNet-DS and ResNet-DS architectures demonstrate: (1) DenseNet-DS: 8.3% faster convergence on CIFAR-10 with comparable accuracy, (2) Hybrid approach: Combining LSUV with LION-DG…
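The Gradient Awakening property described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single linear backbone layer, a linear auxiliary head, and a squared-error auxiliary loss, and every name in it (`he_init`, `aux_grads`, `backbone_w`, `aux_w`) is hypothetical. It shows that with a zero-initialized head the auxiliary gradient reaching the backbone features is exactly zero, while the head's own gradient is not, so the auxiliary signal appears once the head weights have moved.

```python
import math
import random


def he_init(rows, cols, rng):
    """He-normal initialization: each weight ~ N(0, 2 / fan_in)."""
    std = math.sqrt(2.0 / cols)
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


def zeros(rows, cols):
    """Zero initialization, used here for the auxiliary head."""
    return [[0.0] * cols for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def aux_grads(aux_w, h, target):
    """Gradients of the toy auxiliary loss 0.5 * ||aux_w @ h - target||^2.

    Returns (gradient w.r.t. aux_w, gradient w.r.t. backbone features h).
    """
    err = [z - t for z, t in zip(matvec(aux_w, h), target)]
    grad_w = [[e * hj for hj in h] for e in err]  # outer(err, h)
    grad_h = [sum(aux_w[i][j] * err[i] for i in range(len(err)))
              for j in range(len(h))]             # aux_w^T @ err
    return grad_w, grad_h


rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(8)]
backbone_w = he_init(4, 8, rng)  # backbone: He initialization
aux_w = zeros(3, 4)              # auxiliary head: zero initialization
target = [1.0, 0.0, 0.0]

# Assumption: a purely linear backbone, to keep the toy deterministic.
h = matvec(backbone_w, x)

# At initialization the head is zero, so nothing flows back into h.
grad_w, grad_h0 = aux_grads(aux_w, h, target)

# One SGD step on the head alone; its own gradient is nonzero.
lr = 0.1
aux_w = [[w - lr * g for w, g in zip(w_row, g_row)]
         for w_row, g_row in zip(aux_w, grad_w)]

# After the step the head is nonzero and the backbone starts
# receiving an auxiliary gradient.
_, grad_h1 = aux_grads(aux_w, h, target)

print("aux gradient into backbone at init:", grad_h0)
print("aux gradient into backbone after one head update:", grad_h1)
```

In this toy setting the gradient into the backbone is `aux_w^T @ err`, so its magnitude scales with the head weights. That is one way to read the "implicit warmup" wording, though the paper's formal statement is not reproduced in this summary.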
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques
