Structured First-Layer Initialization Pre-Training Techniques to Accelerate Training Process Based on $\varepsilon$-Rank
Tao Tang, Jiang Yang, Yuxiang Zhao, Quanhui Zhu

TL;DR
This paper introduces a structured first-layer initialization method that increases the diversity of neural features at initialization, leading to faster training, better convergence, and improved accuracy for neural networks used in scientific computing.
Contribution
It proposes a novel SFLI pre-training technique that constructs $\varepsilon$-linearly independent neurons, improving training efficiency across various architectures.
Findings
SFLI increases the initial $\varepsilon$-rank and accelerates convergence.
The method mitigates spectral bias and improves prediction accuracy.
Implementation requires only one line of code addition.
Abstract
Training deep neural networks for scientific computing remains computationally expensive due to the slow formation of diverse feature representations in early training stages. Recent studies identify a staircase phenomenon in training dynamics, where loss decreases are closely correlated with increases in $\varepsilon$-rank, reflecting the effective number of linearly independent neuron functions. Motivated by this observation, this work proposes a structured first-layer initialization (SFLI) pre-training method to enhance the diversity of neural features at initialization by constructing $\varepsilon$-linearly independent neurons in the input layer. We present systematic initialization schemes compatible with various activation functions and integrate the strategy into multiple neural architectures, including modified multi-layer perceptrons and physics-informed residual adaptive…
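To make the notion of $\varepsilon$-rank concrete, the following is a minimal sketch, not the authors' implementation. It assumes the $\varepsilon$-rank can be estimated as the number of eigenvalues of the neurons' Gram matrix that exceed a threshold; the paper's exact definition and normalization may differ. The function names, the threshold `eps`, and both initializations are hypothetical choices made for illustration only. In particular, the "spread" initialization is a hand-picked example of diverse first-layer neurons and is not the SFLI scheme.

```python
import numpy as np

def epsilon_rank(features, eps=1e-6):
    """Count Gram-matrix eigenvalues above eps (assumed working definition).

    features: array of shape (n_samples, n_neurons) holding each neuron's
    output at sample points drawn from the input domain.
    """
    n_samples = features.shape[0]
    # Sample average approximates the L2 inner products between neuron functions.
    gram = features.T @ features / n_samples
    eigvals = np.linalg.eigvalsh(gram)  # Gram matrix is symmetric
    return int(np.sum(eigvals > eps))

def first_layer_features(x, weights, biases):
    """Outputs of a one-dimensional first layer with tanh activation."""
    return np.tanh(np.outer(x, weights) + biases)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 512)
n_neurons = 32

# Small random initialization: every neuron is close to the same affine function.
w_small = rng.normal(scale=0.1, size=n_neurons)
b_small = rng.normal(scale=0.1, size=n_neurons)

# Illustrative spread initialization (hypothetical, not SFLI): steep tanh
# neurons whose transition points are spaced evenly across the domain.
centers = np.linspace(-1.0, 1.0, n_neurons)
w_spread = np.full(n_neurons, 30.0)
b_spread = -w_spread * centers

rank_small = epsilon_rank(first_layer_features(x, w_small, b_small))
rank_spread = epsilon_rank(first_layer_features(x, w_spread, b_spread))
print("eps-rank, small random init:", rank_small)
print("eps-rank, spread init:", rank_spread)
```

Under these assumptions, the near-linear neurons of the small random initialization span only a few directions, while the spread neurons yield a larger count. This is the kind of gap in initial feature diversity that the abstract describes.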
Taxonomy
Topics: Technology and Data Analysis · Engineering and Test Systems · Educational Technology and Assessment
