Learning One-hidden-layer Neural Networks with Landscape Design

Rong Ge; Jason D. Lee; Tengyu Ma

arXiv:1711.00501 · cs.LG · November 6, 2017 · 114 citations


TL;DR

This paper develops a theoretical framework for learning one-hidden-layer neural networks with Gaussian inputs, providing a landscape analysis of a specially designed objective function that guarantees convergence to the true parameters.

Contribution

The paper designs a non-convex objective function $G$ whose landscape has no spurious local minima: every local minimum is a global minimum, and every global minimum corresponds to the ground-truth parameters. This structure is what enables provable learning guarantees.

Findings

1. All local minima of the designed objective $G$ are also global minima.

2. All global minima of $G$ correspond to the ground-truth parameters.

3. The value and gradient of $G$ can be estimated from samples.

Abstract

We consider the problem of learning a one-hidden-layer neural network: we assume the input $x \in \mathbb{R}^{d}$ is from Gaussian distribution and the label $y = a^{\top} \sigma(Bx) + \xi$, where $a$ is a nonnegative vector in $\mathbb{R}^{m}$ with $m \leq d$, $B \in \mathbb{R}^{m \times d}$ is a full-rank weight matrix, and $\xi$ is a noise vector. We first give an analytic formula for the population risk of the standard squared loss and demonstrate that it implicitly attempts to decompose a sequence of low-rank tensors simultaneously. Inspired by the formula, we design a non-convex objective function $G(\cdot)$ whose landscape is guaranteed to have the following properties: 1. All local minima of $G$ are also global minima. 2. All global minima of $G$ correspond to the ground truth parameters. 3. The value and gradient of $G$ can be estimated using samples. With these properties,…
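The data model in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the authors' code: it assumes a ReLU activation for $\sigma$, a standard Gaussian input, and independent Gaussian noise per sample, none of which the abstract specifies. The function names `sample_one_hidden_layer` and `empirical_squared_loss` and the parameter `noise_std` are hypothetical, and the loss shown is the standard squared loss, not the designed objective $G$.

```python
import numpy as np


def sample_one_hidden_layer(n, d, m, noise_std=0.1, seed=0):
    """Draw n samples from y = a^T sigma(B x) + xi.

    Assumptions (not stated in the abstract): sigma is ReLU,
    x ~ N(0, I_d), and xi ~ N(0, noise_std^2) independently per sample.
    """
    if m > d:
        raise ValueError("the model requires m <= d")
    rng = np.random.default_rng(seed)
    a = rng.uniform(0.5, 1.5, size=m)        # nonnegative output weights
    B = rng.standard_normal((m, d))          # full rank with probability 1
    X = rng.standard_normal((n, d))          # Gaussian inputs, one row per sample
    xi = noise_std * rng.standard_normal(n)  # additive label noise
    y = np.maximum(X @ B.T, 0.0) @ a + xi    # a^T ReLU(B x) + xi for every row
    return X, y, a, B


def empirical_squared_loss(X, y, a_hat, B_hat):
    """Sample average of the squared loss for candidate parameters."""
    pred = np.maximum(X @ B_hat.T, 0.0) @ a_hat
    return float(np.mean((pred - y) ** 2))


if __name__ == "__main__":
    X, y, a, B = sample_one_hidden_layer(n=5000, d=10, m=4)
    print("loss at ground truth:", empirical_squared_loss(X, y, a, B))
    print("loss at zero weights:", empirical_squared_loss(X, y, np.zeros(4), B))
```

With noise present, the loss at the ground-truth parameters is close to the noise variance rather than zero, which is one reason the sample-based estimates in property 3 matter.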



Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Machine Learning and ELM