On the matrix square root via geometric optimization

Suvrit Sra

arXiv:1507.08366 · math.NA · December 17, 2015

TL;DR

This paper demonstrates that geometric optimization on the manifold of positive definite matrices offers a more effective approach for computing matrix square roots than traditional gradient descent, with faster convergence and better reliability.

Contribution

The paper derives a first-order method based on geodesic convexity that outperforms gradient descent, attains a linear convergence rate, and admits a short convergence analysis (under one page), illustrating the benefits of the geometric view.
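
As a minimal illustration of the geometric view, assume the affine-invariant Riemannian metric on positive definite matrices (a standard choice, though the summary above does not name the metric). Under that metric the geodesic from $A$ to $B$ is $A^{1/2}(A^{-1/2} B A^{-1/2})^{t} A^{1/2}$, and its midpoint between the identity and $A$ is exactly $A^{1/2}$. The sketch below only checks this fact numerically. It relies on an eigendecomposition internally, so it is not the paper's algorithm, and the helper names `spd_power` and `geodesic` are hypothetical.

```python
import numpy as np

def spd_power(M, t):
    """Return M**t for a symmetric positive definite M via eigendecomposition."""
    M = 0.5 * (M + M.T)            # remove round-off asymmetry before eigh
    w, V = np.linalg.eigh(M)
    return (V * w**t) @ V.T        # V diag(w**t) V^T

def geodesic(A, B, t):
    """Point at parameter t on the affine-invariant geodesic from A to B."""
    A_half = spd_power(A, 0.5)
    A_inv_half = spd_power(A, -0.5)
    return A_half @ spd_power(A_inv_half @ B @ A_inv_half, t) @ A_half

# Hypothetical test matrix, positive definite by construction.
rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
A = G @ G.T + 4.0 * np.eye(4)

# Midpoint of the geodesic from the identity to A.
X = geodesic(np.eye(4), A, 0.5)
residual = np.linalg.norm(X @ X - A)   # should be near machine precision
```

The point of the check is conceptual: the square root is a midpoint on the manifold, which is what makes a geodesically convex formulation natural.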

Findings

01. Newton-like methods compute matrix square roots rapidly and reliably.

02. Gradient descent converges slowly due to tiny step-sizes and ill-conditioning.

03. The geometric approach offers conceptual and practical advantages over Euclidean methods.
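
To make the first finding concrete, here is a minimal sketch of one well-known Newton-type scheme, the Denman-Beavers iteration. The summary does not say which Newton-like variants the paper benchmarks, so this choice is an assumption, as are the function name `sqrtm_denman_beavers`, the stopping rule, and the test matrix (condition number about $10^6$).

```python
import numpy as np

def sqrtm_denman_beavers(A, max_iters=50, tol=1e-10):
    """Denman-Beavers iteration: Y_k -> A^{1/2}, Z_k -> A^{-1/2}."""
    n = A.shape[0]
    Y = np.array(A, dtype=float)
    Z = np.eye(n)
    for k in range(1, max_iters + 1):
        # Coupled update; both right-hand sides use the previous Y and Z.
        Y_next = 0.5 * (Y + np.linalg.inv(Z))
        Z_next = 0.5 * (Z + np.linalg.inv(Y))
        step = np.linalg.norm(Y_next - Y) / np.linalg.norm(Y_next)
        Y, Z = Y_next, Z_next
        if step <= tol:            # stop once the iterate has stalled
            break
    return Y, k

# Hypothetical ill-conditioned test problem: eigenvalues from 1e-6 to 1.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((6, 6)))
eigs = np.logspace(-6, 0, 6)
A = (Q * eigs) @ Q.T

Y, iters = sqrtm_denman_beavers(A)
rel_residual = np.linalg.norm(Y @ Y - A) / np.linalg.norm(A)
```

Each step costs two matrix inversions, but the iteration converges quadratically once it is close to the solution, which is why only a handful of steps are typically needed.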

Abstract

This paper is triggered by the preprint "Computing Matrix Squareroot via Non Convex Local Search" by Jain et al. (arXiv:1507.05854), which analyzes gradient-descent for computing the square root of a positive definite matrix. Contrary to claims of Jain et al., our experiments reveal that Newton-like methods compute matrix square roots rapidly and reliably, even for highly ill-conditioned matrices and without requiring commutativity. We observe that gradient-descent converges very slowly primarily due to tiny step-sizes and ill-conditioning. We derive an alternative first-order method based on geodesic convexity: our method admits a transparent convergence analysis ($< 1$ page), attains linear rate, and displays reliable convergence even for rank deficient problems. Though superior to gradient-descent, ultimately our method is also outperformed by a…
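
The contrast between gradient descent and a geometry-aware first-order method can be sketched as follows. Two assumptions are made here and are not confirmed by the text above. First, gradient descent is run on $f(X) = \|X^2 - A\|_F^2$ with a fixed step, for a matrix scaled so its eigenvalues lie in $(0, 1]$. Second, the geometric method is stood in for by a fixed-point iteration derived from an assumed objective, the sum of S-divergences from $X$ to $A$ and to $I$, whose stationarity condition $(X + A)^{-1} + (X + I)^{-1} = X^{-1}$ reduces to $x^2 = a$ in the scalar case. The function names and all parameters are hypothetical.

```python
import numpy as np

def residual(X, A):
    """Frobenius norm of X^2 - A."""
    return np.linalg.norm(X @ X - A)

def gradient_descent_sqrt(A, step=0.05, iters=200):
    """Fixed-step gradient descent on f(X) = ||X^2 - A||_F^2 from X = I."""
    X = np.eye(A.shape[0])
    for _ in range(iters):
        R = X @ X - A
        grad = 2.0 * (X @ R + R @ X)   # gradient of f at a symmetric X
        X = X - step * grad
    return X

def fixed_point_sqrt(A, iters=200):
    """Iterate X <- [(X + A)^{-1} + (X + I)^{-1}]^{-1} starting from X = I."""
    I = np.eye(A.shape[0])
    X = I.copy()
    for _ in range(iters):
        X = np.linalg.inv(np.linalg.inv(X + A) + np.linalg.inv(X + I))
        X = 0.5 * (X + X.T)            # keep the iterate symmetric
    return X

# Hypothetical test problem: eigenvalues between 0.01 and 1 (condition number 100).
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))
eigs = np.linspace(0.01, 1.0, 5)
A = (Q * eigs) @ Q.T

gd_res = residual(gradient_descent_sqrt(A), A)
fp_res = residual(fixed_point_sqrt(A), A)
```

With the same iteration budget, the fixed-step gradient method is held back by the smallest eigenvalue, while the fixed-point map contracts at a linear rate that does not depend on a hand-tuned step size. Neither matches the quadratic convergence of Newton-type schemes.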
