Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems

Christopher De Sa, Kunle Olukotun, and Christopher Ré

arXiv:1411.1134 · cs.LG · February 11, 2015 · 57 cites

TL;DR

This paper proves that a modified stochastic gradient descent algorithm for low-rank matrix problems converges globally from random initialization under broad sampling conditions, with practical experiments demonstrating its efficiency.

Contribution

The paper introduces a step size scheme for SGD on low-rank least-squares problems and proves that the resulting method converges globally from a random starting point under broad sampling conditions.

Findings

01

Convergence within $O(\epsilon^{-1} n \log n)$ steps with constant probability for constant-rank problems.

02

Relation of the modified SGD to stochastic power iteration.

03

Experimental validation of runtime and convergence.

Abstract

Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within $O(\epsilon^{-1} n \log n)$ steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show experiments to illustrate the runtime and convergence of the algorithm.
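To make the link between SGD on a low-rank factor and stochastic power iteration concrete, here is a minimal sketch in Python. It is an assumption-laden illustration, not the paper's algorithm: the step size scheme from the paper is not given in this summary, so the sketch uses a plain constant step with an Oja-style multiplicative update. The names `stochastic_power_iteration`, `sample_fn`, `step`, and `num_steps` are hypothetical.

```python
import numpy as np


def stochastic_power_iteration(sample_fn, n, rank, step, num_steps, rng):
    """Illustrative Oja-style update: Y <- Y + step * A_k @ Y.

    sample_fn(rng) is assumed to return an n x n matrix A_k whose
    expectation is the symmetric matrix A we want a low-rank factor of.
    This is NOT the step size scheme from the paper, only a sketch.
    """
    # Random starting point, as in the global-convergence setting.
    Y = rng.standard_normal((n, rank))
    for _ in range(num_steps):
        A_k = sample_fn(rng)
        # Stochastic multiplicative step, analogous to one power iteration.
        Y = Y + step * (A_k @ Y)
        # Scalar rescaling for numerical stability; it does not change
        # the column space of Y.
        Y /= np.linalg.norm(Y)
    # Orthonormalize once at the end to get a basis for the subspace.
    Q, _ = np.linalg.qr(Y)
    return Q
```

With `sample_fn` returning `np.outer(x, x)` for streaming samples `x`, the expectation of `A_k` is the covariance of `x`, and the columns of the returned `Q` should approximately span its leading eigenspace. The constant step trades convergence speed against the size of the residual error.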

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Matrix Theory and Algorithms

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent