Learning Deep Models: Critical Points and Local Openness

Maher Nouiehed; Meisam Razaviyayn

arXiv:1803.02968 · math.OC · August 7, 2023 · INFORMS J. Optim.

TL;DR

This paper introduces a unifying framework for analyzing the optimization landscapes of deep models, establishing conditions under which local optima are globally optimal, applicable to linear and certain non-linear neural networks.

Contribution

It provides a novel landscape analysis framework based on local openness, extending classical results to non-continuous loss functions and non-differentiable activations.
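
For readers unfamiliar with the term, the following is a brief sketch of what local openness means and why it links local and global optima. The notation ($\mathcal{M}$, $\ell$, $B_\varepsilon$) is chosen here for illustration and is not taken from the paper; balls in the image space are understood relative to the range of $\mathcal{M}$, and the convexity assumption in the last step is one simple setting in which the argument closes, not the paper's full set of conditions.

```latex
% Illustrative notation (assumed, not the paper's): \mathcal{M} is the model
% mapping (e.g. matrix multiplication), \ell is the loss on its output.
\textbf{Definition.} A mapping $\mathcal{M} : \mathcal{X} \to \mathcal{Z}$ is
\emph{locally open} at $\bar{x}$ if for every $\varepsilon > 0$ there exists
$\delta > 0$ such that
\[
  B_\delta\bigl(\mathcal{M}(\bar{x})\bigr) \subseteq \mathcal{M}\bigl(B_\varepsilon(\bar{x})\bigr).
\]

\textbf{Consequence.} Let $\bar{x}$ be a local minimum of $\ell \circ \mathcal{M}$,
so that for some $\varepsilon > 0$,
\[
  \ell\bigl(\mathcal{M}(x)\bigr) \ge \ell\bigl(\mathcal{M}(\bar{x})\bigr)
  \quad \text{for all } x \in B_\varepsilon(\bar{x}).
\]
If $\mathcal{M}$ is locally open at $\bar{x}$, pick the matching $\delta$. Every
$z \in B_\delta(\mathcal{M}(\bar{x}))$ can be written as $z = \mathcal{M}(x)$
with $x \in B_\varepsilon(\bar{x})$, hence
\[
  \ell(z) \ge \ell\bigl(\mathcal{M}(\bar{x})\bigr)
  \quad \text{for all } z \in B_\delta\bigl(\mathcal{M}(\bar{x})\bigr),
\]
i.e. $\bar{z} = \mathcal{M}(\bar{x})$ is a local minimum of $\ell$ over the range
of $\mathcal{M}$. If, in addition, $\ell$ is convex and the range is a convex
set, then $\bar{z}$ is a global minimum, and so $\bar{x}$ is a global minimum
of $\ell \circ \mathcal{M}$.
```

Note that the argument never differentiates $\ell$ or uses its continuity, which is consistent with the extension to non-continuous losses mentioned above.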

Findings

01 Characterization of local openness for matrix multiplication

02 Proof that local optima are global in two-layer linear networks without data assumptions

03 Global/local optima equivalence in certain over-parameterized deep models

Abstract

With the increasing popularity of non-convex deep models, developing a unifying theory for studying the optimization problems that arise from training these models has become increasingly important. Toward this end, we present a unifying landscape analysis framework that can be used when the training objective function is the composite of simple functions. Using the local openness property of the underlying training models, we provide simple sufficient conditions under which any local optimum of the resulting optimization problem is globally optimal. We first completely characterize the local openness of the symmetric and non-symmetric matrix multiplication mapping. Then we use our characterization to: 1) provide a simple proof for the classical result of Burer-Monteiro and extend it to non-continuous loss functions; 2) show that every local optimum of two-layer linear networks…
