On the matrix square root via geometric optimization
Suvrit Sra

TL;DR
This paper argues that, among first-order methods, geometric optimization on the manifold of positive definite matrices computes matrix square roots more effectively than Euclidean gradient descent, converging faster and more reliably. Its experiments also show that Newton-like methods are rapid and reliable for this problem.
Contribution
It introduces a new first-order method based on geodesic convexity that outperforms gradient descent, attains a linear rate, and admits a transparent convergence analysis, highlighting the benefits of the geometric view.
Findings
Newton-like methods compute matrix square roots rapidly and reliably.
Gradient descent converges slowly due to tiny step-sizes and ill-conditioning.
The geometric approach offers conceptual and practical advantages over Euclidean methods.
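To make the first two findings concrete, here is a minimal sketch that contrasts a Newton-like iteration with Euclidean gradient descent. It uses the classical Denman-Beavers coupled iteration as a stand-in for "Newton-like methods" and the objective f(X) = ||X^2 - A||_F^2 for gradient descent. Both choices are assumptions made for illustration, not a reproduction of the paper's experiments. All helper names (identity, matmul, combine, inverse, residual) are hypothetical, and only the standard library is used.

```python
# Illustrative sketch only: plain-Python helpers, not code from the paper.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def combine(A, B, alpha=1.0, beta=1.0):
    """Return alpha*A + beta*B entrywise."""
    return [[alpha * a + beta * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def inverse(A):
    """Gauss-Jordan elimination with partial pivoting (fine for small dense matrices)."""
    n = len(A)
    M = [[float(v) for v in row] + e for row, e in zip(A, identity(n))]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def residual(X, A):
    """Frobenius norm of X*X - A."""
    R = combine(matmul(X, X), A, 1.0, -1.0)
    return sum(v * v for row in R for v in row) ** 0.5

def denman_beavers(A, iters=20):
    """Newton-like coupled iteration: Y tends to A^(1/2), Z tends to A^(-1/2)."""
    Y, Z = [row[:] for row in A], identity(len(A))
    for _ in range(iters):
        # Both updates use the previous Y and Z (tuple assignment evaluates the right side first).
        Y, Z = combine(Y, inverse(Z), 0.5, 0.5), combine(Z, inverse(Y), 0.5, 0.5)
    return Y, Z

def gradient_descent(A, step=0.02, iters=500):
    """Euclidean gradient descent on f(X) = ||X*X - A||_F^2, started at the identity."""
    X = identity(len(A))
    for _ in range(iters):
        R = combine(matmul(X, X), A, 1.0, -1.0)
        # Gradient for symmetric X: 2 * (X R + R X).
        grad = combine(matmul(X, R), matmul(R, X), 2.0, 2.0)
        X = combine(X, grad, 1.0, -step)
    return X
```

For a small test matrix such as A = [[4.0, 1.0], [1.0, 3.0]], calling residual(denman_beavers(A)[0], A) and residual(gradient_descent(A), A) lets the two be compared directly. The step size 0.02 was picked by hand for this well-conditioned example. A stable step has to shrink as the largest eigenvalue of A grows, which is one way to see the slowdown described above.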
Abstract
This paper is triggered by the preprint "Computing Matrix Squareroot via Non Convex Local Search" by Jain et al. (arXiv:1507.05854), which analyzes gradient-descent for computing the square root of a positive definite matrix. Contrary to claims of Jain et al., our experiments reveal that Newton-like methods compute matrix square roots rapidly and reliably, even for highly ill-conditioned matrices and without requiring commutativity. We observe that gradient-descent converges very slowly primarily due to tiny step-sizes and ill-conditioning. We derive an alternative first-order method based on geodesic convexity: our method admits a transparent convergence analysis (< 1 page), attains linear rate, and displays reliable convergence even for rank deficient problems. Though superior to gradient-descent, ultimately our method is also outperformed by a…
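As a rough illustration of a simple first-order scheme with a linear rate, the sketch below iterates the fixed-point equation X^(-1) = (X + A)^(-1) + (X + I)^(-1), which the square root X = A^(1/2) satisfies. Whether this coincides with the paper's exact update is an assumption, and the function and helper names are hypothetical. The code uses only the standard library and is meant for small, well-conditioned positive definite inputs.

```python
# Illustrative sketch only; not taken from the paper.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Gauss-Jordan elimination with partial pivoting (fine for small dense matrices)."""
    n = len(A)
    M = [[float(v) for v in row] + e for row, e in zip(A, identity(n))]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(n):
            if r != c:
                f = M[r][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [row[n:] for row in M]

def fixed_point_sqrt(A, iters=100):
    """Iterate X <- [(X + A)^(-1) + (X + I)^(-1)]^(-1), starting from the identity."""
    I = identity(len(A))
    X = I
    for _ in range(iters):
        X = inverse(add(inverse(add(X, A)), inverse(add(X, I))))
    return X
```

Calling fixed_point_sqrt([[4.0, 1.0], [1.0, 3.0]]) returns a matrix X for which matmul(X, X) can be checked against the input. In the scalar case the error shrinks by a roughly constant factor per step, which is what a linear rate means. No step size needs tuning here, unlike plain gradient descent.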
