Zero loss guarantees and explicit minimizers for generic overparametrized Deep Learning networks
Thomas Chen, Andrew G. Moore

TL;DR
This paper establishes conditions under which overparametrized deep neural networks can achieve zero training loss, providing explicit minimizers and analyzing the impact of depth on optimization efficiency.
Contribution
It offers explicit constructions of zero loss solutions and analyzes how network depth affects gradient-based optimization in overparametrized deep learning.
Findings
For generic training data, sufficient conditions on overparametrized networks guarantee that zero loss is attainable.
Explicit zero loss minimizers can be constructed without gradient descent.
Increasing depth may impair gradient descent efficiency due to Jacobian rank loss.
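As a rough illustration of what an explicit zero loss minimizer without gradient descent can look like, here is a minimal sketch for a single linear layer. This is not the paper's construction for deep networks. The sizes N, M, Q, the random data, and the use of the pseudoinverse are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: N training samples, input dimension M >= N, output dimension Q.
N, M, Q = 5, 8, 3
X = rng.standard_normal((M, N))   # columns are training inputs
Y = rng.standard_normal((Q, N))   # columns are target outputs

# For generic data X has full column rank N, so pinv(X) @ X is the N x N identity
# and W @ X reproduces Y exactly (up to floating-point error).
W = Y @ np.linalg.pinv(X)         # written down in closed form, no gradient descent

# Squared-error cost averaged over the training set.
loss = 0.5 * np.sum((W @ X - Y) ** 2) / N
rank_X = np.linalg.matrix_rank(X)
print("rank of X:", rank_X, "loss:", loss)
```

The layer has Q*M = 24 parameters against Q*N = 15 scalar constraints, so the toy problem is overparametrized, and the full-rank assumption on X plays the role of genericity here.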
Abstract
We determine sufficient conditions for overparametrized deep learning (DL) networks to guarantee the attainability of zero loss in the context of supervised learning, for the $\mathcal{L}^2$ cost and generic training data. We present an explicit construction of the zero loss minimizers without invoking gradient descent. On the other hand, we point out that increasing depth can reduce the efficiency of cost minimization by gradient descent, which we show by analyzing the conditions for rank loss of the training Jacobian. Our results clarify key aspects of the dichotomy between zero loss reachability in underparametrized versus overparametrized DL.
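To make "rank loss of the training Jacobian" concrete, the sketch below computes the Jacobian of a toy one-hidden-layer ReLU network's outputs with respect to its parameters and compares a generic bias with one that switches every hidden unit off. The architecture, the sizes, and the helper name training_jacobian are hypothetical, and the toy case does not reproduce the paper's depth analysis. It shows only one mechanism by which the Jacobian can lose rank.

```python
import numpy as np

def training_jacobian(W1, b1, w2, X):
    """Jacobian of the scalar outputs f(x_j) = w2 . relu(W1 x_j + b1)
    with respect to all parameters (w2, W1, b1); one row per training sample."""
    rows = []
    for x in X.T:                          # iterate over training inputs
        pre = W1 @ x + b1                  # hidden pre-activations
        mask = (pre > 0).astype(float)     # derivative of ReLU
        d_w2 = np.maximum(pre, 0.0)        # df/dw2 equals the hidden activations
        d_b1 = w2 * mask                   # df/db1
        d_W1 = np.outer(d_b1, x).ravel()   # df/dW1, flattened
        rows.append(np.concatenate([d_w2, d_W1, d_b1]))
    return np.array(rows)

rng = np.random.default_rng(1)
M, H, N = 3, 6, 4                          # hypothetical input dim, width, sample count
X = rng.standard_normal((M, N))
W1 = rng.standard_normal((H, M))
w2 = rng.standard_normal(H)

b_generic = rng.standard_normal(H)         # typical bias: some units stay active
b_dead = np.full(H, -1e6)                  # extreme bias: ReLU zeroes every unit

D_generic = training_jacobian(W1, b_generic, w2, X)
D_dead = training_jacobian(W1, b_dead, w2, X)
rank_generic = np.linalg.matrix_rank(D_generic)
rank_dead = np.linalg.matrix_rank(D_dead)
print("rank (generic bias):", rank_generic, "rank (dead units):", rank_dead)
```

When every unit is inactive the Jacobian is identically zero, so a gradient step cannot change the network's outputs on the training data. The rank of the Jacobian can never exceed the number of samples N.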
Taxonomy
Topics: Fault Detection and Control Systems · Neural Networks and Applications · Adversarial Robustness in Machine Learning
