Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Alon Brutzkus, Amir Globerson

TL;DR
This paper proves that for a one-hidden-layer convolutional neural network with non-overlapping filters, ReLU activations, and Gaussian inputs, gradient descent finds the global optimum in polynomial time. The authors believe this is the first such guarantee for ReLU CNNs.
Contribution
To the authors' knowledge, it establishes the first polynomial-time guarantee that gradient descent converges to the global optimum for a convolutional neural network with ReLU activations, under a Gaussian input distribution.
Findings
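For the non-overlapping, one-hidden-layer ReLU architecture, gradient descent converges to the global optimum in polynomial time when the inputs are Gaussian.
For the same architecture, learning is NP-complete in the general case, without the Gaussian assumption.
To the authors' knowledge, this is the first global optimality guarantee for gradient descent on a convolutional neural network with ReLU activations.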
Abstract
Deep learning models are often successfully trained using gradient descent, despite the worst case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.
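The setting in the abstract can be illustrated with a small numerical sketch. This is not the authors' code: it assumes a teacher-student setup in which labels come from a hidden filter `w_star`, a squared loss, and a network output that averages the ReLU responses over the non-overlapping patches. It also uses minibatch gradient steps on fresh Gaussian samples as a stand-in for gradient descent on the expected loss. The function names, the filter size `m`, the patch count `k`, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np


def no_overlap_net(w, X):
    """One-hidden-layer net with a single shared filter and no overlap.

    Each row of X is split into k = d // m disjoint patches of length m = len(w).
    The output is the average of ReLU(w . patch) over the k patches
    (the averaging is an assumption of this sketch).
    """
    n, d = X.shape
    m = w.shape[0]
    patches = X.reshape(n, d // m, m)            # shape (n, k, m)
    return np.maximum(patches @ w, 0.0).mean(axis=1)


def train_student(m=4, k=3, steps=3000, batch=256, lr=0.2, seed=0):
    """Fit a student filter to a hidden teacher filter on Gaussian inputs."""
    rng = np.random.default_rng(seed)
    w_star = rng.normal(size=m)                  # hypothetical teacher filter
    w = rng.normal(size=m)                       # random student initialization
    for _ in range(steps):
        X = rng.normal(size=(batch, m * k))      # fresh standard Gaussian inputs
        patches = X.reshape(batch, k, m)
        pre = patches @ w                        # pre-activations, shape (batch, k)
        residual = np.maximum(pre, 0.0).mean(axis=1) - no_overlap_net(w_star, X)
        active = (pre > 0).astype(float)         # ReLU derivative per patch
        # Gradient of 0.5 * mean(residual**2) with respect to w.
        grad = np.einsum("n,nj,njm->m", residual, active, patches) / (batch * k)
        w -= lr * grad
    return w, w_star
```

Calling `w, w_star = train_student()` and comparing `np.linalg.norm(w - w_star)` before and after training gives a quick empirical check of the convergence behavior the abstract describes. Replacing the Gaussian sampler with another input distribution is the natural way to probe the regime the guarantee does not cover.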
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
