Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Yaniv Blumenfeld, Dar Gilboa, Daniel Soudry

TL;DR
This paper shows that neural networks can be trained effectively from an initialization in which all features are identical, challenging the belief that feature diversity is essential. Non-deterministic GPU operations can supply the symmetry breaking that training requires.
Contribution
It introduces a deep convolutional network initialized with nearly identical weights that still trains successfully, highlighting the role of non-deterministic GPU operations in symmetry breaking.
Findings
Identical feature initialization can still lead to high-accuracy training.
Non-deterministic GPU operations serve as sufficient symmetry breakers.
Diversity in initial features is not strictly necessary for effective training.
Abstract
Deep neural networks are typically initialized with random weights, with variances chosen to facilitate signal propagation and stable gradients. It is also believed that diversity of features is an important property of these initializations. We construct a deep convolutional network with identical features by initializing almost all the weights to 0. The architecture also enables perfect signal propagation and stable gradients, and achieves high accuracy on standard benchmarks. This indicates that random, diverse initializations are not necessary for training neural networks. An essential element in training this network is a mechanism of symmetry breaking; we study this phenomenon and find that standard GPU operations, which are non-deterministic, can serve as a sufficient source of symmetry breaking to enable training.
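To make the role of symmetry breaking concrete, here is a minimal sketch in plain Python. It is not the paper's architecture or its GPU mechanism: it assumes a toy one-hidden-layer tanh network, a hypothetical constant initial value of 0.5, random toy data, and a hand-injected perturbation of 1e-6 standing in for whatever noise source breaks the symmetry. Under fully deterministic arithmetic, hidden units that start identical receive identical gradients and therefore stay identical; a tiny perturbation to one weight is enough to keep one unit distinct from the others.

```python
import math
import random


def train(W1, w2, data, lr=0.05, steps=200):
    """Plain SGD on a one-hidden-layer tanh network with squared loss.

    W1: list of hidden-unit weight rows, w2: output weights.
    Inputs are copied, so the caller's lists are not modified.
    """
    W1 = [row[:] for row in W1]
    w2 = w2[:]
    for _ in range(steps):
        for x, y in data:
            # Forward pass: hidden activations and scalar prediction error.
            h = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
            err = sum(a * hj for a, hj in zip(w2, h)) - y
            # Backward pass: every unit's update uses the same formula.
            for j, row in enumerate(W1):
                g = err * w2[j] * (1.0 - h[j] ** 2)
                for i, xi in enumerate(x):
                    row[i] -= lr * g * xi
                w2[j] -= lr * err * h[j]
    return W1, w2


def feature_spread(W1):
    """Largest absolute weight difference between any hidden unit and unit 0."""
    return max(abs(a - b) for row in W1 for a, b in zip(row, W1[0]))


random.seed(0)
# Toy regression data (an assumption for illustration only).
data = [([random.uniform(-1, 1) for _ in range(3)], random.uniform(-1, 1))
        for _ in range(8)]

# Identical features: every hidden unit starts with the same weights.
W1_const = [[0.5] * 3 for _ in range(4)]
w2_const = [0.5] * 4

sym_W1, _ = train(W1_const, w2_const, data)

# Hand-injected symmetry breaking: nudge a single weight by 1e-6.
W1_perturbed = [row[:] for row in W1_const]
W1_perturbed[1][0] += 1e-6
pert_W1, _ = train(W1_perturbed, w2_const, data)

print("spread without symmetry breaking:", feature_spread(sym_W1))
print("spread with a tiny perturbation: ", feature_spread(pert_W1))
```

In the deterministic run the spread remains exactly zero, so the four hidden units never become distinct features. The sketch only illustrates why some source of asymmetry is needed; it does not model the non-deterministic GPU operations studied in the paper.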
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
