How Can Deep Neural Networks Fail Even With Global Optima?

Qingguang Guan

arXiv:2407.16872·cs.LG·July 25, 2024

TL;DR

This paper investigates why deep neural networks can fail despite reaching global optima, showing that extremely overfitting models that attain a global minimum of the cost function can still perform poorly on classification and function approximation tasks.

Contribution

The paper extends the expressive power of shallow neural networks to networks of any depth, and constructs extremely overfitting deep networks that reach global optima yet still perform poorly.

Findings

01 Overfitting deep networks can still perform poorly.

02 Global optima do not guarantee good generalization.

03 Theoretical analysis supports empirical results.

Abstract

Fully connected deep neural networks are successfully applied to classification and function approximation problems. By minimizing the cost function, i.e., finding the proper weights and biases, models can be built for accurate predictions. The ideal optimization process can achieve global optima. However, do global optima always perform well? If not, how bad can it be? In this work, we aim to: 1) extend the expressive power of shallow neural networks to networks of any depth using a simple trick, 2) construct extremely overfitting deep neural networks that, despite having global optima, still fail to perform well on classification and function approximation problems. Different types of activation functions are considered, including ReLU, Parametric ReLU, and Sigmoid functions. Extensive theoretical analysis has been conducted, ranging from one-dimensional models to models of any…
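To make the central question concrete, here is a minimal one-dimensional sketch of how a network can sit at a global optimum of the training cost and still approximate the target badly. It is an illustration only, not the paper's construction: the target function, the grid of training points, the spike width `eps`, and the helper names `target` and `spike_network` are all assumptions introduced here. The network has one hidden ReLU layer and places a very narrow "hat" on each training point, so it matches every training label while returning roughly zero everywhere else.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def target(x):
    # Hypothetical target function, chosen so it stays well away from zero.
    return 2.0 + np.sin(2.0 * np.pi * x)


def spike_network(x, x_train, y_train, eps):
    """One-hidden-layer ReLU network with three hidden units per training point.

    Each triple of units forms a hat of height 1 and half-width eps centred
    on a training point; the linear output layer scales the hat by the label.
    """
    x = np.asarray(x, dtype=float)[:, None]   # shape (m, 1)
    c = x_train[None, :]                      # shape (1, n)
    hats = (relu(x - c + eps) - 2.0 * relu(x - c) + relu(x - c - eps)) / eps
    return hats @ y_train                     # shape (m,)


# Assumed setup: 11 equally spaced training points on [0, 1].
x_train = np.linspace(0.0, 1.0, 11)
y_train = target(x_train)
eps = 1e-3  # much smaller than the grid spacing, so the hats never overlap

# Training cost: essentially zero, which is a global minimum of the MSE.
train_pred = spike_network(x_train, x_train, y_train, eps)
train_mse = np.mean((train_pred - y_train) ** 2)

# Test cost at the midpoints between training points: the network outputs
# about zero there, while the target is at least 1.
x_test = (x_train[:-1] + x_train[1:]) / 2.0
test_pred = spike_network(x_test, x_train, y_train, eps)
test_mse = np.mean((test_pred - target(x_test)) ** 2)

print(f"train MSE: {train_mse:.3e}")
print(f"test MSE:  {test_mse:.3e}")
```

Because the mean squared error is non-negative, any set of weights that drives it to zero on the training set is a global minimizer, so the gap between the two printed values comes from the model's shape between the samples rather than from a failure of optimization. The sketch covers only the shallow ReLU case; it does not reproduce the depth extension or the other activation functions mentioned in the abstract.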
