Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Alon Brutzkus; Amir Globerson

arXiv:1702.07966·cs.LG·February 28, 2017·80 cites

TL;DR

This paper proves that, for a one-hidden-layer convolutional neural network with non-overlapping filters and ReLU activations, gradient descent finds the global optimum in polynomial time when the inputs are Gaussian. To the authors' knowledge, this is the first such guarantee for ReLU CNNs.

Contribution

The paper establishes what the authors believe is the first polynomial-time guarantee that gradient descent converges to the global optimum for a convolutional neural network with ReLU activations, under a Gaussian input distribution.

Findings

01

Gradient descent converges to the global optimum in polynomial time for Gaussian inputs.

02

Without the Gaussian input assumption, learning this architecture is NP-complete in the general case.

03

To the authors' knowledge, this is the first global optimality guarantee for gradient descent on a CNN with ReLU activations.

Abstract

Deep learning models are often successfully trained using gradient descent, despite the worst case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.
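The setting in the abstract can be illustrated with a small numerical experiment. The sketch below is not the authors' code. It assumes a teacher-student setup in which a single filter `w` is applied to `k` non-overlapping patches of the input, the ReLU outputs are averaged, and the targets come from a hidden filter `w_star` of the same form. The function names, the patch count, filter size, learning rate, and the use of fresh Gaussian mini-batches to approximate the population loss are all illustrative assumptions.

```python
import numpy as np

def forward(w, x, k):
    """Non-overlapping conv layer: ReLU on each patch, then average pooling."""
    m = w.shape[0]
    patches = x.reshape(-1, k, m)          # (n, k, m): k disjoint patches of size m
    pre = patches @ w                      # (n, k): pre-activations w . x[i]
    return np.maximum(pre, 0.0).mean(axis=1), patches, pre

def loss_and_grad(w, w_star, x, k):
    """Squared loss against the teacher filter w_star, with its gradient in w."""
    pred, patches, pre = forward(w, x, k)
    target, _, _ = forward(w_star, x, k)
    err = pred - target                    # (n,)
    loss = 0.5 * np.mean(err ** 2)
    # d pred / d w = (1/k) * sum over active patches (ReLU subgradient)
    active = (pre > 0).astype(x.dtype)     # (n, k)
    dpred = np.einsum("nk,nkm->nm", active, patches) / k
    grad = (err[:, None] * dpred).mean(axis=0)
    return loss, grad

def train(k=4, m=5, n=4096, steps=300, lr=0.5, seed=0):
    """Gradient descent on fresh Gaussian batches (hypothetical hyperparameters)."""
    rng = np.random.default_rng(seed)
    w_star = rng.standard_normal(m)        # hidden teacher filter
    w = rng.standard_normal(m)             # random student initialization
    losses = []
    for _ in range(steps):
        x = rng.standard_normal((n, k * m))  # i.i.d. standard Gaussian inputs
        loss, grad = loss_and_grad(w, w_star, x, k)
        losses.append(loss)
        w = w - lr * grad
    return w, w_star, losses

if __name__ == "__main__":
    w, w_star, losses = train()
    print("initial loss:", losses[0])
    print("final loss:  ", losses[-1])
    print("distance to teacher filter:", np.linalg.norm(w - w_star))
```

Under these assumptions the empirical loss should shrink as the student filter approaches the teacher filter, which is the behavior the convergence result describes for the population loss. The experiment only illustrates the claim; it does not verify the polynomial-time bound.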

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
