Failures of Gradient-Based Deep Learning

Shai Shalev-Shwartz; Ohad Shamir; Shaked Shammah

arXiv:1703.07950 · cs.LG · April 28, 2017 · 70 cites

TL;DR

This paper identifies four types of simple problems on which the gradient-based algorithms commonly used in deep learning fail or struggle. It illustrates the failures with experiments and gives theoretical analysis of their sources and possible remedies.

Contribution

It describes four types of simple problems on which gradient-based methods fail or suffer significant difficulties, and pairs experimental evidence with theoretical insights into the causes and possible remedies.

Findings

01  Gradient-based algorithms fail on certain simple problems

02  Theoretical explanations clarify sources of failures

03  Proposed insights suggest possible remedies

Abstract

In recent years, Deep Learning has become the go-to solution for a broad range of applications, often outperforming state-of-the-art. However, it is important, for both theoreticians and practitioners, to gain a deeper understanding of the difficulties and limitations associated with common approaches and algorithms. We describe four types of simple problems, for which the gradient-based algorithms commonly used in deep learning either fail or suffer from significant difficulties. We illustrate the failures through practical experiments, and provide theoretical insights explaining their source, and how they might be remedied.
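
The abstract does not name the four problem types, so the following is only an illustrative sketch of how a gradient can carry little information about the target. It assumes a family of parity targets h_v(x) = (-1)^<v,x> on binary inputs, a squared loss, and a hypothetical tanh predictor; the function name parity_gradient_variance and all parameters are made up for this example. For a fixed weight vector w, it computes the exact gradient for every target v and checks that the gradients barely differ across targets: their variance is at most E_x||dp_w(x)/dw||^2 divided by the number of targets, which follows from the parities being orthonormal under the uniform distribution.

```python
import itertools
import numpy as np

def parity_gradient_variance(d=8, seed=0):
    """Return (variance of the gradient over all parity targets, its upper bound)."""
    rng = np.random.default_rng(seed)
    # All 2^d binary inputs; the same rows also index the 2^d parity targets v.
    X = np.array(list(itertools.product([0, 1], repeat=d)), dtype=float)
    V = X
    # H[i, j] = (-1)^{<v_j, x_i>}: target j evaluated on input i.
    H = 1.0 - 2.0 * ((X @ V.T) % 2)
    # Hypothetical predictor p_w(x) = tanh(<w, x>) at a random point w (assumption).
    w = rng.normal(size=d)
    p = np.tanh(X @ w)
    G = (1.0 - p ** 2)[:, None] * X  # dp_w(x)/dw for each input x
    # Exact gradient of F_v(w) = 0.5 * E_x[(p_w(x) - h_v(x))^2], one row per target v.
    grads = (p[:, None] - H).T @ G / len(X)
    # Spread of the gradient across targets, and the orthonormality-based bound.
    variance = np.mean(np.sum((grads - grads.mean(axis=0)) ** 2, axis=1))
    bound = np.mean(np.sum(G ** 2, axis=1)) / len(V)
    return variance, bound

if __name__ == "__main__":
    for d in (4, 8, 10):
        var, bound = parity_gradient_variance(d)
        print(f"d={d:2d}  variance={var:.3e}  bound={bound:.3e}")
```

Because the number of targets is 2^d, the bound shrinks exponentially with the input dimension in this setup, so a noisy gradient estimate would reveal almost nothing about which parity is being learned.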

Code & Models

Repositories

shakedshammah/failures_of_DL (tf, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms