Is Feature Diversity Necessary in Neural Network Initialization?
Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

TL;DR
This paper investigates the role of feature diversity at initialization in neural networks. It shows that a complete lack of diversity hampers training, that a small amount of noise is enough to mitigate this, and that a network initialized with identical features can still be trained successfully.
Contribution
It demonstrates that feature diversity is not strictly necessary for training neural networks, challenging prior assumptions, and introduces methods to compensate for low diversity at initialization.
Findings
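A complete lack of feature diversity at initialization harms training.
A relatively small amount of added noise counteracts this effect; even the noise in standard non-deterministic GPU computations is sufficient.
A deep convolutional network with identical features at initialization and almost all of its weights initialized at 0 can still be trained to match the accuracy of its standard-initialized counterpart.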
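As a rough illustration of the findings above (not the authors' code or their convolutional construction), the sketch below builds a toy dense layer whose features are all copies of one random vector, then adds small Gaussian noise to break the symmetry. The function names, the 1/sqrt(fan_in) scaling, the noise scale of 1e-3, and the use of pairwise cosine similarity as a stand-in for "diversity" are all assumptions made for this example.

```python
import math
import random


def identical_feature_init(n_features, fan_in, noise_std=0.0, seed=0):
    """Return n_features weight vectors copied from one random vector.

    Hypothetical helper: each copy is perturbed by i.i.d. Gaussian noise
    with standard deviation noise_std (0.0 means exact copies).
    """
    rng = random.Random(seed)
    # One shared feature, scaled by 1/sqrt(fan_in) (an assumed, common choice).
    base = [rng.gauss(0.0, 1.0 / math.sqrt(fan_in)) for _ in range(fan_in)]
    return [
        [w + rng.gauss(0.0, noise_std) for w in base]
        for _ in range(n_features)
    ]


def cosine(u, v):
    """Cosine similarity between two weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def min_pairwise_cosine(features):
    """Smallest cosine similarity over all feature pairs.

    A value of 1.0 (up to rounding) means the features are not diverse at all.
    """
    return min(
        cosine(features[i], features[j])
        for i in range(len(features))
        for j in range(i + 1, len(features))
    )


# Exact copies: every pair of features points in the same direction.
no_noise = identical_feature_init(8, 64)
# Small noise: features stay nearly parallel but are no longer identical.
noisy = identical_feature_init(8, 64, noise_std=1e-3)

print("min cosine, identical features:", min_pairwise_cosine(no_noise))
print("min cosine, with small noise:  ", min_pairwise_cosine(noisy))
```

With exact copies, every feature receives the same gradient in a symmetric layer, so the copies cannot diverge on their own; the small perturbation is what lets them separate during training. How much noise is needed in practice is an empirical question the paper addresses, and this toy example does not reproduce those experiments.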
Abstract
Standard practice in training neural networks involves initializing the weights in an independent fashion. The results of recent work suggest that feature "diversity" at initialization plays an important role in training the network. However, other initialization schemes with reduced feature diversity have also been shown to be viable. In this work, we conduct a series of experiments aimed at elucidating the importance of feature diversity at initialization. We show that a complete lack of diversity is harmful to training, but its effects can be counteracted by a relatively small addition of noise - even the noise in standard non-deterministic GPU computations is sufficient. Furthermore, we construct a deep convolutional network with identical features at initialization and almost all of the weights initialized at 0 that can be trained to reach accuracy matching its standard-initialized…
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Neural Networks and Applications
