Failures of Gradient-Based Deep Learning
Shai Shalev-Shwartz, Ohad Shamir, Shaked Shammah

TL;DR
This paper examines the limitations of the gradient-based algorithms commonly used in deep learning. It identifies four types of simple problems on which they fail or struggle, and uses experiments and theoretical analysis to explain the failures and suggest remedies.
Contribution
The paper describes four types of simple problems on which gradient-based methods either fail or suffer from significant difficulties, and supports each with experimental evidence and theoretical insight into its cause and possible remedies.
Findings
Gradient-based algorithms fail or suffer from significant difficulties on four types of simple problems
Theoretical analysis explains the source of these failures
The resulting insights point to possible remedies
Abstract
In recent years, Deep Learning has become the go-to solution for a broad range of applications, often outperforming state-of-the-art. However, it is important, for both theoreticians and practitioners, to gain a deeper understanding of the difficulties and limitations associated with common approaches and algorithms. We describe four types of simple problems, for which the gradient-based algorithms commonly used in deep learning either fail or suffer from significant difficulties. We illustrate the failures through practical experiments, and provide theoretical insights explaining their source, and how they might be remedied.
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms
