Global Convergence of Non-Convex Gradient Descent for Computing Matrix Squareroot
Prateek Jain, Chi Jin, Sham M. Kakade, Praneeth Netrapalli

TL;DR
This paper gives the first proof that gradient descent converges globally, and at a good rate, when computing the square root of a positive definite matrix, a fundamental problem in numerical linear algebra with broad applications.
Contribution
It establishes the first global convergence result with a near-optimal rate for non-convex gradient descent in matrix square root computation, including robustness to errors.
Findings
Gradient descent finds an ε-approximate square root in O(α log(‖M-U₀²‖_F/ε)) iterations.
The convergence rate is robust to iteration errors.
The proof technique may extend to other non-convex optimization problems.
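The iteration behind these findings can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the objective is f(U) = ‖M − U²‖_F², whose gradient for symmetric U is proportional to (U² − M)U + U(U² − M), with the constant factor absorbed into the step size. The function name `sqrt_gd`, the step size `eta=0.05`, the tolerance, and the 2×2 test matrix are all hypothetical choices made for the example; the paper's own step-size condition is not reproduced here.

```python
import numpy as np


def sqrt_gd(M, U0, eta, eps=1e-10, max_iters=10_000):
    """Gradient descent on f(U) = ||M - U^2||_F^2 (illustrative sketch).

    M   : symmetric positive definite matrix
    U0  : symmetric positive definite starting point
    eta : step size (assumed small enough for this M; not the paper's rule)
    Returns the final iterate and the number of iterations used.
    """
    U = U0.copy()
    for t in range(max_iters):
        R = U @ U - M  # residual U^2 - M
        if np.linalg.norm(R, "fro") <= eps:
            return U, t
        # Gradient direction, up to a constant factor folded into eta.
        U = U - eta * (R @ U + U @ R)
    return U, max_iters


# Hypothetical example: a small PD matrix and the identity as a PD start.
M = np.array([[4.0, 1.0], [1.0, 3.0]])
U, iters = sqrt_gd(M, np.eye(2), eta=0.05)
print(iters, np.linalg.norm(U @ U - M, "fro"))
```

The stopping test uses the same quantity that appears in the iteration bound, the Frobenius norm of M − U², so the loop ends once the iterate is an ε-approximate square root in that sense.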
Abstract
While there has been a significant amount of work studying gradient descent techniques for non-convex optimization problems over the last few years, all existing results establish either local convergence with good rates or global convergence with highly suboptimal rates, for many problems of interest. In this paper, we take the first step in getting the best of both worlds: establishing global convergence and obtaining a good rate of convergence for the problem of computing squareroot of a positive definite (PD) matrix, which is a widely studied problem in numerical linear algebra with applications in machine learning and statistics among others. Given a PD matrix $M$ and a PD starting point $U_0$, we show that gradient descent with appropriately chosen step-size finds an $\epsilon$-accurate squareroot of $M$ in $O(\alpha \log(\|M - U_0^2\|_F / \epsilon))$ iterations, where $\alpha =…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Advanced Optimization Algorithms Research
