Learning One-hidden-layer Neural Networks with Landscape Design
Rong Ge, Jason D. Lee, Tengyu Ma

TL;DR
This paper develops a theoretical framework for learning one-hidden-layer neural networks with Gaussian inputs, providing a landscape analysis of a specially designed objective function that guarantees convergence to the true parameters.
Contribution
The paper introduces a novel non-convex objective function with a landscape that ensures all local minima are global and correspond to true parameters, enabling provable learning guarantees.
Findings
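All local minima of the designed objective are global minima, and the global minima correspond to the ground-truth parameters.
Stochastic gradient descent on the designed objective provably converges to the ground-truth parameters.
A finite-sample complexity result is established.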
Abstract
We consider the problem of learning a one-hidden-layer neural network: we assume the input $x \in \mathbb{R}^d$ is from Gaussian distribution and the label $y = a^\top \sigma(Bx) + \xi$, where $a$ is a nonnegative vector in $\mathbb{R}^m$ with $m \le d$, $B \in \mathbb{R}^{m \times d}$ is a full-rank weight matrix, and $\xi$ is a noise vector. We first give an analytic formula for the population risk of the standard squared loss and demonstrate that it implicitly attempts to decompose a sequence of low-rank tensors simultaneously. Inspired by the formula, we design a non-convex objective function $G(\cdot)$ whose landscape is guaranteed to have the following properties: 1. All local minima of $G$ are also global minima. 2. All global minima of $G$ correspond to the ground truth parameters. 3. The value and gradient of $G$ can be estimated using samples. With these properties,…
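The data model in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a ReLU activation and Gaussian label noise (the abstract fixes neither), and the names `sample_data`, `empirical_squared_loss`, `a_true`, and `B_true` are hypothetical. It only draws samples from the model and evaluates the standard squared loss; it does not implement the paper's designed objective.

```python
import numpy as np


def relu(z):
    # Assumed activation; the abstract does not fix a specific choice.
    return np.maximum(z, 0.0)


def sample_data(a, B, n, noise_std=0.1, rng=None):
    """Draw n samples: Gaussian input x, label a^T relu(B x) + noise."""
    rng = np.random.default_rng() if rng is None else rng
    d = B.shape[1]
    X = rng.standard_normal((n, d))           # rows are inputs x ~ N(0, I_d)
    xi = noise_std * rng.standard_normal(n)   # assumed Gaussian noise
    y = relu(X @ B.T) @ a + xi
    return X, y


def empirical_squared_loss(a, B, X, y):
    """Sample average of the squared loss for candidate parameters (a, B)."""
    pred = relu(X @ B.T) @ a
    return np.mean((pred - y) ** 2)


rng = np.random.default_rng(0)
d, m = 8, 4                                   # hidden width m <= input dim d
# Orthonormal rows give a full-rank m x d weight matrix (one convenient choice).
B_true = np.linalg.qr(rng.standard_normal((d, m)))[0].T
a_true = rng.uniform(0.5, 1.5, size=m)        # nonnegative output weights

X, y = sample_data(a_true, B_true, n=5000, noise_std=0.1, rng=rng)
loss_true = empirical_squared_loss(a_true, B_true, X, y)
loss_rand = empirical_squared_loss(a_true, rng.standard_normal((m, d)), X, y)
```

At the ground-truth parameters the empirical loss should sit near the noise variance, while a random weight matrix gives a much larger value. This is the quantity whose population version the paper analyzes before designing its alternative objective.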
Videos
Learning One-hidden-layer Neural Networks with Landscape Design · YouTube
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM
