Preconditioned Gradient Descent for Overparameterized Nonconvex Burer--Monteiro Factorization with Global Optimality Certification
Gavin Zhang, Salar Fattahi, Richard Y. Zhang

TL;DR
This paper introduces a preconditioner for gradient descent that accelerates convergence in overparameterized nonconvex Burer--Monteiro problems, enabling efficient global optimality certification despite ill-conditioning.
Contribution
The authors propose an inexpensive preconditioner that restores the linear convergence rate of gradient descent in the overparameterized setting, regardless of how ill-conditioned the global minimizer is.
Findings
Preconditioned gradient descent achieves linear convergence in overparameterized cases.
The method is robust to ill-conditioning of the global minimizer.
Overparameterization no longer slows convergence with the proposed preconditioner.
Abstract
We consider using gradient descent to minimize the nonconvex function $f(X) = \phi(XX^T)$ over an $n \times r$ factor matrix $X$, in which $\phi$ is an underlying smooth convex cost function defined over $n \times n$ matrices. While only a second-order stationary point $X$ can be provably found in reasonable time, if $X$ is additionally rank deficient, then its rank deficiency certifies it as being globally optimal. This way of certifying global optimality necessarily requires the search rank $r$ of the current iterate $X$ to be overparameterized with respect to the rank $r^\star$ of the global minimizer $X^\star$. Unfortunately, overparameterization significantly slows down the convergence of gradient descent, from a linear rate with $r = r^\star$ to a sublinear rate when $r > r^\star$, even when $\phi$ is strongly convex. In this paper, we propose an inexpensive preconditioner that…
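The slowdown described in the abstract can be seen on a toy problem. The sketch below is illustrative only and is not taken from the paper: it assumes the simplest strongly convex cost, phi(M) = 0.5 * ||M - M_star||_F^2, so that f(X) = phi(X X^T) has gradient 2 (X X^T - M_star) X. The abstract is cut off before the preconditioner is specified, so the form used here, right-multiplying the gradient by (X^T X + eta I)^{-1} with eta set to the current residual norm, is an assumption chosen for illustration. The function name `factorized_descent`, the step size, the problem sizes, and the rank threshold are likewise hypothetical.

```python
import numpy as np

def factorized_descent(M_star, X0, step, iters, precondition, tol=1e-10):
    """Minimize f(X) = 0.5 * ||X X^T - M_star||_F^2 over the factor X.

    With precondition=True, the gradient is right-multiplied by
    (X^T X + eta I)^{-1}; this damped form is an ASSUMPTION for illustration.
    """
    X = X0.copy()
    r = X.shape[1]
    for _ in range(iters):
        E = X @ X.T - M_star            # residual = gradient of phi at X X^T
        err = np.linalg.norm(E)         # Frobenius norm of the residual
        if err < tol:                   # stop once the residual is negligible
            break
        G = 2.0 * E @ X                 # gradient of f with respect to X
        if precondition:
            # Assumed damping: eta tracks the current residual norm.
            P = X.T @ X + err * np.eye(r)
            # P is symmetric, so (P^{-1} G^T)^T equals G P^{-1}.
            G = np.linalg.solve(P, G.T).T
        X = X - step * G
    return X, np.linalg.norm(X @ X.T - M_star)

rng = np.random.default_rng(0)
n, r_star, r = 20, 2, 4                 # search rank r exceeds true rank r_star

# Ground truth M_star = Z Z^T with rank r_star and eigenvalues 1 and 0.25.
Q, _ = np.linalg.qr(rng.standard_normal((n, r_star)))
Z = Q * np.array([1.0, 0.5])
M_star = Z @ Z.T

# Start near the solution: true factor padded with zero columns, plus noise.
X0 = np.hstack([Z, np.zeros((n, r - r_star))])
X0 = X0 + 0.01 * rng.standard_normal((n, r))

X_gd, err_gd = factorized_descent(M_star, X0, 0.2, 500, precondition=False)
X_prec, err_prec = factorized_descent(M_star, X0, 0.2, 500, precondition=True)

# Numerical rank of the preconditioned iterate (threshold is hypothetical).
sv = np.linalg.svd(X_prec, compute_uv=False)
numerical_rank = int(np.sum(sv > 1e-3))

print(f"plain GD residual:          {err_gd:.3e}")
print(f"preconditioned GD residual: {err_prec:.3e}")
print(f"numerical rank of X_prec:   {numerical_rank} (search rank {r})")
```

In this setup the two surplus columns of X are what stall plain gradient descent, while the damped preconditioner rescales exactly those directions. The final rank check only illustrates the rank deficiency that the abstract ties to global optimality; it does not verify second-order stationarity and is not a full certificate.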
Taxonomy
Topics: Advanced Optimization Algorithms Research · Sparse and Compressive Sensing Techniques · Matrix Theory and Algorithms
