Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable
Promit Ghosal, Srinath Mahankali, Yihang Sun

TL;DR
This paper proves that a sufficiently wide, randomly initialized one-layer neural network maps two sets to two linearly separable sets with high probability, without any training.
Contribution
The paper establishes bounds on the width a random one-layer network needs in order to make two sets linearly separable, including a bound that is independent of the input dimension.
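For concreteness, the setting can be written out as follows. The notation here ($X^+$, $X^-$, $\Phi$, $\sigma$, $W$, $b$, $n$, $d$) is introduced only for illustration and is an assumption, not taken from the paper; in particular, this summary does not state the activation function or the distribution of the random weights.

```latex
% Assumed notation, not from the source.
% X^+ and X^- are the two input sets in R^d; n is the width of the layer.
\[
\Phi(x) = \sigma(Wx + b), \qquad W \in \mathbb{R}^{n \times d},\quad b \in \mathbb{R}^{n},
\]
where $W$ and $b$ are drawn at random and never trained.
The images $\Phi(X^+)$ and $\Phi(X^-)$ are linearly separable if there exist
$a \in \mathbb{R}^{n}$ and $c \in \mathbb{R}$ such that
\[
\langle a, \Phi(x) \rangle + c > 0 \quad \text{for all } x \in X^+,
\qquad
\langle a, \Phi(x) \rangle + c < 0 \quad \text{for all } x \in X^-.
\]
```

In this notation, a width bound is a lower bound on $n$ that makes the separability event hold with high probability over the draw of $W$ and $b$.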
Findings
Randomly initialized networks can make data linearly separable without training.
The first width bound is exponential in the input dimension and polynomial in all other parameters; the second is independent of the input dimension.
The proof combines geometric arguments with concentration-of-measure techniques.
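The first finding can be checked numerically on a toy case. The sketch below is not the paper's construction: it assumes a ReLU activation with i.i.d. standard Gaussian weights and biases, picks an arbitrary width of 50, and uses the four XOR points as the two sets. The helper names `random_relu_features` and `separating_hyperplane_found` are hypothetical.

```python
import numpy as np

def random_relu_features(X, width, rng):
    """Map rows of X through one random, untrained layer: ReLU(X W^T + b)."""
    d = X.shape[1]
    W = rng.standard_normal((width, d))  # assumption: i.i.d. standard Gaussian weights
    b = rng.standard_normal(width)       # assumption: i.i.d. standard Gaussian biases
    return np.maximum(X @ W.T + b, 0.0)

def separating_hyperplane_found(F, y, tol=1e-6):
    """Sufficient (not necessary) check for linear separability:
    fit an affine function by least squares and test whether every
    point is classified with a strictly positive margin."""
    A = np.hstack([F, np.ones((F.shape[0], 1))])  # append an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return bool(np.all(y * (A @ coef) > tol))

# XOR: the two sets {(0,0), (1,1)} and {(0,1), (1,0)} are not linearly separable in R^2.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

rng = np.random.default_rng(0)
features = random_relu_features(X, width=50, rng=rng)

raw_ok = separating_hyperplane_found(X, y)            # check on the raw inputs
lifted_ok = separating_hyperplane_found(features, y)  # check after the random layer
print("raw inputs separated:", raw_ok)
print("random features separated:", lifted_ok)
```

The least-squares test only certifies separability when it succeeds; a failure on other data would not by itself prove that no separating hyperplane exists. The width used here is chosen for convenience and says nothing about the bounds proved in the paper.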
Abstract
Recently, neural networks have demonstrated remarkable capabilities in mapping two arbitrary sets to two linearly separable sets. The prospect of achieving this with randomly initialized neural networks is particularly appealing due to the computational efficiency compared to fully trained networks. This paper contributes by establishing that, given sufficient width, a randomly initialized one-layer neural network can, with high probability, transform two sets into two linearly separable sets without any training. Moreover, we furnish precise bounds on the necessary width of the neural network for this phenomenon to occur. Our initial bound exhibits exponential dependence on the input dimension while maintaining polynomial dependence on all other parameters. In contrast, our second bound is independent of input dimension, effectively surmounting the curse of dimensionality. The main…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Neural Networks and Applications
