Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable

Promit Ghosal; Srinath Mahankali; Yihang Sun

arXiv:2205.11716·cs.LG·October 10, 2023

TL;DR

This paper proves that a sufficiently wide, randomly initialized one-layer neural network can, with high probability and without any training, map two sets to two linearly separable sets, and it gives bounds on the width required.

Contribution

It establishes bounds on the width needed for random one-layer networks to achieve data separability, including a dimension-independent bound, advancing understanding of neural network initialization.

Findings

01

Randomly initialized networks can make data linearly separable without training.

02

The first width bound is exponential in the input dimension and polynomial in all other parameters; the second is independent of the input dimension.

03

Theoretical proof combines geometric and concentration of measure techniques.

Abstract

Recently, neural networks have demonstrated remarkable capabilities in mapping two arbitrary sets to two linearly separable sets. The prospect of achieving this with randomly initialized neural networks is particularly appealing due to the computational efficiency compared to fully trained networks. This paper contributes by establishing that, given sufficient width, a randomly initialized one-layer neural network can, with high probability, transform two sets into two linearly separable sets without any training. Moreover, we furnish precise bounds on the necessary width of the neural network for this phenomenon to occur. Our initial bound exhibits exponential dependence on the input dimension while maintaining polynomial dependence on all other parameters. In contrast, our second bound is independent of input dimension, effectively surmounting the curse of dimensionality. The main…
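
As a rough illustration of the phenomenon the abstract describes, the sketch below passes the four XOR points (two sets that no line separates in the plane) through a single untrained ReLU layer and then fits a linear readout by least squares. The Gaussian weights and biases, the width of 50, and the helper names `random_relu_layer` and `least_squares_margin` are assumptions chosen for the example; they are not taken from the paper, and the width is not one of its bounds. A least-squares margin well above zero is used only as a sufficient certificate that a separating hyperplane exists.

```python
import numpy as np

def random_relu_layer(x, width, rng):
    """Map inputs through one untrained layer: ReLU(x W + b)."""
    dim = x.shape[1]
    weights = rng.standard_normal((dim, width))  # random, never trained (assumed Gaussian)
    biases = rng.standard_normal(width)          # assumed Gaussian biases
    return np.maximum(x @ weights + biases, 0.0)

def least_squares_margin(features, labels):
    """Fit a linear readout (with intercept) by least squares.

    Returns the smallest signed margin min_i y_i * f(x_i). A clearly
    positive value certifies that the fitted hyperplane separates the sets.
    """
    design = np.hstack([features, np.ones((features.shape[0], 1))])
    readout, *_ = np.linalg.lstsq(design, labels, rcond=None)
    return float(np.min(labels * (design @ readout)))

rng = np.random.default_rng(0)

# XOR: the two sets {(0,0), (1,1)} and {(0,1), (1,0)}.
points = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
labels = np.array([1.0, 1.0, -1.0, -1.0])

# On the raw inputs the best affine fit is identically zero, so the margin is ~0.
raw_margin = least_squares_margin(points, labels)

# After the random layer, an exact linear fit exists and the margin is ~1.
hidden = random_relu_layer(points, width=50, rng=rng)
hidden_margin = least_squares_margin(hidden, labels)

print("margin on raw inputs:      ", round(raw_margin, 6))
print("margin on random features: ", round(hidden_margin, 6))
```

With only four points, any width of at least four already makes an exact fit generically possible, so this toy case says nothing about how the required width scales; it only shows the mechanism of separating data with a random layer followed by a linear readout.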

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Neural Networks and Applications