Dropping Convexity for Faster Semi-definite Optimization

Srinadh Bhojanapalli; Anastasios Kyrillidis; Sujay Sanghavi

arXiv:1509.03917 · stat.ML · April 19, 2016 · 50 citations

Open Access

TL;DR

This paper analyzes the convergence of Factored Gradient Descent (FGD) for semi-definite optimization, showing that its local convergence rates match those of standard gradient descent on the original objective, and provides an initialization procedure that yields global convergence in some settings.

Contribution

It gives convergence guarantees for FGD on general convex functions $f$, together with a step size rule and an initialization procedure, under standard assumptions on $f$ (smoothness and, for the faster rate, restricted strong convexity).

Findings

01

With the proposed step size, FGD's local error after $k$ steps is $O(1/k)$ when $f$ is smooth.

02

FGD's local error is exponentially small in $k$ when $f$ is (restricted) strongly convex.

03

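With the provided initialization procedure, which needs only first-order oracle access to a (restricted) strongly convex $f$, FGD converges globally in certain cases.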

Abstract

We study the minimization of a convex function $f(X)$ over the set of $n \times n$ positive semi-definite matrices, but when the problem is recast as $\min_{U} g(U) := f(UU^{\top})$, with $U \in \mathbb{R}^{n \times r}$ and $r \leq n$. We study the performance of gradient descent on $g$, which we refer to as Factored Gradient Descent (FGD), under standard assumptions on the original function $f$. We provide a rule for selecting the step size and, with this choice, show that the local convergence rate of FGD mirrors that of standard gradient descent on the original $f$: i.e., after $k$ steps, the error is $O(1/k)$ for smooth $f$, and exponentially small in $k$ when $f$ is (restricted) strongly convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives and when one only has access to $f$ via a first-order oracle; for several problem instances,…
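To make the factored formulation concrete, here is a minimal NumPy sketch of gradient descent on $g(U) = f(UU^{\top})$. By the chain rule, $\nabla g(U) = (\nabla f(X) + \nabla f(X)^{\top})\,U$ with $X = UU^{\top}$, which reduces to $2\nabla f(X)\,U$ when $\nabla f(X)$ is symmetric. Everything else in the sketch is an assumption made for illustration: the function name `fgd`, the toy objective $f(X) = \tfrac{1}{2}\|X - M\|_F^2$, the starting point near a solution, and the hand-picked constant step size. In particular, the step size below is not the rule proposed in the paper, and the starting point does not use the paper's initialization procedure.

```python
import numpy as np

def fgd(grad_f, U0, step, iters):
    """Plain gradient descent on g(U) = f(U U^T) (illustrative sketch).

    grad_f : callable returning the gradient of f at an n x n matrix X
    U0     : n x r starting factor
    step   : constant step size (hand-picked here, an assumption)
    iters  : number of gradient steps
    """
    U = U0.copy()
    for _ in range(iters):
        G = grad_f(U @ U.T)
        # Chain rule: grad g(U) = (G + G^T) U, i.e. 2 G U for symmetric G.
        U = U - step * (G + G.T) @ U
    return U

# Toy problem (assumed for illustration): f(X) = 0.5 * ||X - M||_F^2,
# where M is a rank-r positive semi-definite matrix.
rng = np.random.default_rng(0)
n, r = 20, 3
U_true = rng.standard_normal((n, r))
M = U_true @ U_true.T

def grad_f(X):
    return X - M

# Start close to a solution, since the rates quoted above are local.
U0 = U_true + 0.1 * rng.standard_normal((n, r))

# Conservative constant step scaled by the spectral norm of M (an assumption).
step = 1.0 / (16.0 * np.linalg.norm(M, 2))

U_hat = fgd(grad_f, U0, step, iters=2000)
rel_err0 = np.linalg.norm(U0 @ U0.T - M) / np.linalg.norm(M)
rel_err = np.linalg.norm(U_hat @ U_hat.T - M) / np.linalg.norm(M)
print(f"relative error: start {rel_err0:.2e}, end {rel_err:.2e}")
```

Note that the iterate is the $n \times r$ factor $U$, so $X = UU^{\top}$ is positive semi-definite by construction and no projection onto the PSD cone (which would require an eigendecomposition) is needed at any step. The price is that $g$ is non-convex in $U$ even though $f$ is convex in $X$.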

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Advanced Optimization Algorithms Research