Learning Deep Models: Critical Points and Local Openness
Maher Nouiehed, Meisam Razaviyayn

TL;DR
This paper introduces a unifying framework for analyzing the optimization landscapes of deep models, establishing conditions under which local optima are globally optimal, applicable to linear and certain non-linear neural networks.
Contribution
It provides a novel landscape analysis framework based on local openness, extending classical results to non-continuous loss functions and non-differentiable activations.
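For readers unfamiliar with the term, the central notion can be stated briefly. The notation below ($\mathcal{M}$, $\ell$, $B_\varepsilon$, $\bar{x}$) is chosen for illustration and is not necessarily the paper's own; it is a sketch of the standard definition and of the observation that drives the framework.

```latex
% Local openness (notation assumed for illustration):
% a mapping M : X -> Z, with Z taken to be the range of M, is locally open
% at x if every small ball around x maps onto a neighborhood of M(x) in Z.
\[
  \mathcal{M} \text{ is locally open at } x
  \iff
  \forall\, \varepsilon > 0 \;\; \exists\, \delta > 0 :\quad
  B_\delta\bigl(\mathcal{M}(x)\bigr) \subseteq \mathcal{M}\bigl(B_\varepsilon(x)\bigr).
\]
% Consequence for a composite objective \ell \circ M:
\[
  \bar{x} \text{ is a local minimum of } \ell \circ \mathcal{M}
  \ \text{ and }\ \mathcal{M} \text{ is locally open at } \bar{x}
  \;\Longrightarrow\;
  \mathcal{M}(\bar{x}) \text{ is a local minimum of } \ell \text{ over } \mathcal{Z}.
\]
```

If, in addition, every local minimum of $\ell$ over $\mathcal{Z}$ is global (for instance, when $\ell$ is convex on a convex range), then $\bar{x}$ is a global minimum of the composite problem.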
Findings
Characterization of local openness for matrix multiplication
Proof that local optima are global in two-layer linear networks without data assumptions
Global/local optima equivalence in certain over-parameterized deep models
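The two-layer linear finding can be illustrated numerically. The sketch below is not from the paper: it assumes a squared loss, a hidden layer at least as wide as the input and output (so the best linear fit is attainable), and data whitened purely to keep gradient descent well conditioned. All names (`W1`, `W2`, `loss_star`, and so on) are hypothetical. It only shows that plain gradient descent from a random start reaches the least-squares optimum in this setting, which is consistent with the absence of spurious local optima but does not prove it.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, k, n = 3, 2, 3, 20  # input dim, output dim, hidden width, samples

# Assumption for numerical convenience only: rows of X are orthonormal.
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))  # Q is n x d
X = Q.T                                           # d x n, X @ X.T = I
Y = rng.standard_normal((m, n))                   # arbitrary targets


def loss(W2, W1):
    """Squared loss 0.5 * ||W2 W1 X - Y||_F^2 of the two-layer linear net."""
    R = W2 @ W1 @ X - Y
    return 0.5 * np.sum(R ** 2)


# Reference value: unconstrained least squares min_W 0.5 * ||W X - Y||_F^2.
W_star = np.linalg.lstsq(X.T, Y.T, rcond=None)[0].T
loss_star = 0.5 * np.sum((W_star @ X - Y) ** 2)

# Plain gradient descent on the non-convex factored problem.
W1 = 0.3 * rng.standard_normal((k, d))
W2 = 0.3 * rng.standard_normal((m, k))
lr = 0.05
for _ in range(5000):
    G = (W2 @ W1 @ X - Y) @ X.T  # gradient with respect to the product W2 W1
    g2 = G @ W1.T                # chain rule: gradient with respect to W2
    g1 = W2.T @ G                # chain rule: gradient with respect to W1
    W2 -= lr * g2
    W1 -= lr * g1

final_loss = loss(W2, W1)
print("factored loss:", final_loss, "least-squares loss:", loss_star)
```

With a narrower hidden layer (`k` smaller than both `d` and `m`), the appropriate reference value would be the rank-constrained least-squares optimum rather than `loss_star`.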
Abstract
With the increasing popularity of non-convex deep models, developing a unifying theory for studying the optimization problems that arise from training these models becomes very significant. Toward this end, we present in this paper a unifying landscape analysis framework that can be used when the training objective function is the composite of simple functions. Using the local openness property of the underlying training models, we provide simple sufficient conditions under which any local optimum of the resulting optimization problem is globally optimal. We first completely characterize the local openness of the symmetric and non-symmetric matrix multiplication mapping. Then we use our characterization to: 1) provide a simple proof for the classical result of Burer-Monteiro and extend it to non-continuous loss functions; 2) show that every local optimum of two-layer linear networks…
