Provable Acceleration of Nesterov's Accelerated Gradient for Rectangular Matrix Factorization and Linear Neural Networks
Zhenghao Xu, Yuqing Wang, Tuo Zhao, Rachel Ward, Molei Tao

TL;DR
This paper proves that Nesterov's accelerated gradient method attains the best-known iteration complexity among first-order methods for rectangular matrix factorization, and extends the analysis to linear neural networks, improving upon previous bounds with a novel unbalanced initialization.
Contribution
The paper establishes provable acceleration of Nesterov's method for nonconvex rectangular matrix factorization and linear neural networks, using a new unbalanced initialization strategy.
Findings
NAG attains an iteration complexity of O(κ log(1/ε)) for matrix factorization.
Unbalanced initialization enables faster convergence without large network widths.
Results extend to linear neural networks with minimal width requirements.
Abstract
We study the convergence rate of first-order methods for rectangular matrix factorization, which is a canonical nonconvex optimization problem. Specifically, given a rank-r matrix A ∈ ℝ^{m×n}, we prove that gradient descent (GD) can find a pair of ε-optimal solutions X_T ∈ ℝ^{m×d} and Y_T ∈ ℝ^{n×d}, where d ≥ r, satisfying ‖X_T Y_Tᵀ − A‖_F ≤ ε‖A‖_F in T = O(κ² log(1/ε)) iterations with high probability, where κ denotes the condition number of A. Furthermore, we prove that Nesterov's accelerated gradient (NAG) attains an iteration complexity of O(κ log(1/ε)), which is the best-known bound of first-order methods for rectangular matrix factorization.…
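As a rough illustration of the setting in the abstract, the NumPy sketch below runs GD and a Nesterov-style momentum iteration on a small synthetic rank-3 matrix. Everything specific here is an assumption made for the demo rather than something stated above: the objective ½‖XYᵀ − A‖_F², the reading of "unbalanced" as a large X0 = c·A·Φ (Gaussian Φ) with Y0 = 0, the step size and momentum derived from the singular values of X0, and all names and dimensions. It is not the paper's exact algorithm or parameter choice.

```python
import numpy as np

def rel_err(X, Y, A):
    # Relative Frobenius error ||X Y^T - A||_F / ||A||_F.
    return np.linalg.norm(X @ Y.T - A) / np.linalg.norm(A)

def grads(X, Y, A):
    # Gradients of 0.5 * ||X Y^T - A||_F^2 with respect to X and Y.
    R = X @ Y.T - A
    return R @ Y, R.T @ X

def run(A, X0, Y0, eta, beta, T):
    # beta = 0 gives plain GD; beta > 0 gives a Nesterov-style iteration
    # (extrapolate with momentum, then take a gradient step there).
    X, Y = X0.copy(), Y0.copy()
    X_prev, Y_prev = X.copy(), Y.copy()
    errs = [rel_err(X, Y, A)]
    for _ in range(T):
        Xe = X + beta * (X - X_prev)
        Ye = Y + beta * (Y - Y_prev)
        gX, gY = grads(Xe, Ye, A)
        X_prev, Y_prev = X, Y
        X, Y = Xe - eta * gX, Ye - eta * gY
        errs.append(rel_err(X, Y, A))
    return np.array(errs)

# Synthetic rank-r target with condition number 4 (demo values, assumed).
rng = np.random.default_rng(0)
m, n, r, d, T = 30, 20, 3, 20, 500
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
A = U @ np.diag([1.0, 0.5, 0.25]) @ V.T

# Assumed unbalanced initialization: X0 large, Y0 exactly zero.
c = 50.0
Phi = rng.standard_normal((n, d)) / np.sqrt(d)
X0 = c * A @ Phi
Y0 = np.zeros((n, d))

# Demo-only tuning: treat the problem as a least-squares problem in Y
# with curvature between sigma_r(X0)^2 and sigma_1(X0)^2.
sv = np.linalg.svd(X0, compute_uv=False)
L, mu = sv[0] ** 2, sv[r - 1] ** 2
eta = 1.0 / L
beta = (np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))

gd_err = run(A, X0, Y0, eta, 0.0, T)
nag_err = run(A, X0, Y0, eta, beta, T)
print("GD  relative error after", T, "iterations:", gd_err[-1])
print("NAG relative error after", T, "iterations:", nag_err[-1])
```

With X0 large and Y0 zero, the iterates of X move very little, so the problem behaves almost like linear least squares in Y. That is the intuition this sketch relies on for choosing eta and beta; the paper's own analysis and constants may differ.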
Taxonomy
Topics: Neural Networks and Applications
