Theory II: Landscape of the Empirical Risk in Deep Learning
Qianli Liao, Tomaso Poggio

TL;DR
This paper combines theory and experiments to analyze the landscape of empirical risk in overparametrized deep convolutional neural networks, revealing many degenerate global minima and suggesting the loss surface may be simpler than previously thought.
Contribution
It characterizes the empirical risk landscape of overparametrized DCNNs, proving the existence of many degenerate global minima and visualizing the training process.
Findings
Existence of numerous degenerate global minima with zero empirical error.
The empirical risk landscape can be simpler than traditionally believed.
SGD tends to find the most robust zero-minimizer.
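The first finding can be illustrated with a toy model. The sketch below is not the paper's experiment: it assumes a polynomial-feature regression that is linear in its weights, with more weights than training points, and the function name `degenerate_minima_demo` and all parameter values are hypothetical. It shows two distinct weight vectors that both reach zero empirical error, and it counts the zero eigenvalues of the Hessian, which is what makes the minima degenerate.

```python
import numpy as np


def degenerate_minima_demo(n_points=5, n_params=12, seed=0):
    """Toy overparametrized regression (an assumption, not the paper's DCNN setup)."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, n_points)   # distinct training inputs
    y = rng.normal(size=n_points)          # arbitrary training targets

    # Polynomial features x^0 .. x^(n_params - 1): more weights than data points.
    phi = np.vander(x, N=n_params, increasing=True)

    def empirical_risk(w):
        return float(np.mean((phi @ w - y) ** 2))

    # One zero-error solution: the minimum-norm interpolant.
    w_min_norm = np.linalg.pinv(phi) @ y

    # Rows of vt beyond the rank span the null space of phi.
    _, s, vt = np.linalg.svd(phi)
    rank = int(np.sum(s > 1e-10 * s[0]))
    null_basis = vt[rank:]

    # Moving along a null-space direction leaves the fit unchanged.
    w_other = w_min_norm + 3.0 * null_basis[0]

    # Hessian of the mean squared error with respect to the weights.
    hessian = 2.0 * phi.T @ phi / n_points
    eigvals = np.linalg.eigvalsh(hessian)
    n_flat = int(np.sum(eigvals < 1e-10 * eigvals.max()))

    return {
        "risk_min_norm": empirical_risk(w_min_norm),
        "risk_other": empirical_risk(w_other),
        "weights_differ": not np.allclose(w_min_norm, w_other),
        "flat_directions": n_flat,
    }


result = degenerate_minima_demo()
print(result)
```

In this linear-in-weights setting the number of flat directions equals the number of weights minus the number of training points. The networks studied in the paper are nonlinear in their weights, so this only conveys the intuition.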
Abstract
Previous theoretical work on deep learning and neural network optimization tends to focus on avoiding saddle points and local minima. However, the practical observation is that, at least in the case of the most successful Deep Convolutional Neural Networks (DCNNs), practitioners can always increase the network size to fit the training data (an extreme example would be [1]). The most successful DCNNs, such as VGG and ResNets, are best used with a degree of "overparametrization". In this work, we characterize, with a mix of theory and experiments, the landscape of the empirical risk of overparametrized DCNNs. We first prove, in the regression framework, the existence of a large number of degenerate global minimizers with zero empirical error (modulo inconsistent equations). The argument, which relies on Bézout's theorem, is rigorous when the ReLUs are replaced by a polynomial nonlinearity…
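The counting idea behind the Bézout argument can be sketched as follows. This is a reading of the abstract, not the paper's own derivation: the symbols $N$ (number of training pairs), $K$ (number of weights), $f$, $w$, and $d_i$ are notation assumed here for illustration.

```latex
% Zero empirical error means solving one equation per training pair:
\[
  f(x_i; w) = y_i, \qquad i = 1, \dots, N, \qquad w \in \mathbb{R}^K .
\]
% With a polynomial nonlinearity, each equation is a polynomial in w
% of some degree d_i.
%
% Square case (K = N): by Bezout's theorem, if the number of complex
% solutions is finite, it is bounded by the product of the degrees:
\[
  \#\{\text{isolated solutions}\} \;\le\; \prod_{i=1}^{N} d_i .
\]
% Overparametrized case (K > N): if the equations are consistent, every
% component of the complex solution set has dimension at least K - N,
% so zero-error solutions are not isolated points:
\[
  \dim \{\, w : f(x_i; w) = y_i \ \ \forall i \,\} \;\ge\; K - N \;>\; 0 .
\]
```

Under these assumptions, the zero-error minimizers form continuous families rather than isolated points, which is the sense in which they are degenerate.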
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
Methods: Dropout · Dense Connections · Max Pooling · Softmax · Convolution · Stochastic Gradient Descent
