Global Convergence of Stochastic Gradient Descent for Some Non-convex Matrix Problems
Christopher De Sa, Kunle Olukotun, and Christopher Ré

TL;DR
This paper proves that a modified stochastic gradient descent algorithm for low-rank matrix problems converges globally from random initialization under broad sampling conditions, and reports experiments illustrating its runtime and convergence.
Contribution
The paper introduces a step size scheme for SGD on low-rank least-squares problems and proves its global convergence from random start under broad sampling conditions.
Findings
Convergence within $O(\epsilon^{-1} n \log n)$ steps with constant probability for constant-rank problems.
Relation of the modified SGD to stochastic power iteration.
Experimental validation of runtime and convergence.
Abstract
Stochastic gradient descent (SGD) on a low-rank factorization is commonly employed to speed up matrix problems including matrix completion, subspace tracking, and SDP relaxation. In this paper, we exhibit a step size scheme for SGD on a low-rank least-squares problem, and we prove that, under broad sampling conditions, our method converges globally from a random starting point within $O(\epsilon^{-1} n \log n)$ steps with constant probability for constant-rank problems. Our modification of SGD relates it to stochastic power iteration. We also show experiments to illustrate the runtime and convergence of the algorithm.
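To make the link between SGD and stochastic power iteration concrete, here is a minimal rank-1 sketch in Python. It is not the paper's step size scheme. It assumes a fixed step size `eta`, uniform single-entry sampling of a symmetric matrix, and a small test matrix with a known top eigenvector; the function name and all parameters are hypothetical and chosen only for illustration.

```python
import math
import random


def stochastic_power_iteration(A, steps=20000, eta=0.005, seed=0):
    """Estimate the top eigenvector of a symmetric matrix A (list of lists)
    using one sampled entry of A per step. Illustrative sketch only."""
    n = len(A)
    rng = random.Random(seed)
    # Random starting point with Gaussian entries.
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for k in range(steps):
        # Sample one entry (i, j) uniformly. The matrix n*n*A[i][j]*e_i*e_j^T
        # is an unbiased estimate of A.
        i = rng.randrange(n)
        j = rng.randrange(n)
        # The update y <- y + eta * A_sample @ y changes only coordinate i.
        y[i] += eta * n * n * A[i][j] * y[j]
        # Rescale occasionally to keep the numbers bounded. The update is
        # linear in y, so rescaling does not change the direction.
        if k % 100 == 99:
            norm = math.sqrt(sum(v * v for v in y))
            y = [v / norm for v in y]
    norm = math.sqrt(sum(v * v for v in y))
    return [v / norm for v in y]


# Assumed test matrix: A = 0.1 * I + 0.9 * u u^T with u = (1, ..., 1) / sqrt(n),
# so the top eigenvector is u (eigenvalue 1.0, all others 0.1).
n = 4
A = [[0.1 * (i == j) + 0.9 / n for j in range(n)] for i in range(n)]
u = [1.0 / math.sqrt(n)] * n

y = stochastic_power_iteration(A)
# Absolute cosine between the estimate and the true top eigenvector.
alignment = abs(sum(a * b for a, b in zip(y, u)))
```

In expectation each step applies `(I + eta * A)` to `y`, which is a power-iteration step on a shifted matrix, so the component along the top eigenvector grows fastest. With a fixed step size the estimate keeps fluctuating around the true direction, so `alignment` should end up close to, but not exactly, 1.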
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Matrix Theory and Algorithms
Methods: Stochastic Gradient Descent
