How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong, Lijun Ding, Simon S. Du

TL;DR
This paper investigates how over-parameterization affects the convergence of gradient descent in matrix sensing: convergence becomes sublinear in the symmetric setting, while the asymmetric setting retains linear convergence. It supplies new theoretical bounds and a modified gradient descent method.
Contribution
It provides the first global convergence results for over-parameterized matrix sensing in both symmetric and asymmetric settings, highlighting the impact of symmetry and initialization.
Findings
Over-parameterization leads to a slower $\frac{1}{T^2}$ convergence in symmetric settings.
Asymmetric over-parameterization achieves linear convergence, with a rate that depends on the initialization scale.
A modified gradient descent step can recover a rate that is independent of the initialization scale.
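
The symmetric finding can be illustrated with a small numerical sketch. This is not the authors' code: the function name `run_gd`, the problem sizes, the step size `eta`, the initialization scale `alpha`, and the use of symmetrized Gaussian measurements are all assumptions chosen for illustration. The sketch runs plain gradient descent on the factorized least-squares objective with a factor of width `k` and records the Frobenius error against the ground truth, once with `k` equal to the true rank and once with `k` larger.

```python
import numpy as np

def run_gd(k, n=10, r=1, m=2000, alpha=0.1, eta=0.05, T=2000, seed=0):
    """GD on f(X) = 1/(2m) * sum_i (<A_i, X X^T> - y_i)^2 with X of shape (n, k).

    All sizes, the step size eta, and the init scale alpha are illustrative
    assumptions, not values taken from the paper.
    """
    rng = np.random.default_rng(seed)
    # Rank-r PSD ground truth, rescaled to unit spectral norm.
    U = rng.standard_normal((n, r))
    M_star = U @ U.T
    M_star /= np.linalg.norm(M_star, 2)
    # Symmetrized Gaussian measurements, stored as rows of an (m, n*n) matrix.
    G = rng.standard_normal((m, n, n))
    A = ((G + G.transpose(0, 2, 1)) / 2).reshape(m, n * n)
    y = A @ M_star.ravel()
    # Small random initialization of the factor matrix.
    X = alpha * rng.standard_normal((n, k))
    errs = np.empty(T + 1)
    for t in range(T):
        M = X @ X.T
        errs[t] = np.linalg.norm(M - M_star)  # Frobenius error before the step
        resid = A @ M.ravel() - y
        # Gradient of f: (2/m) * sum_i resid_i * A_i @ X (each A_i is symmetric).
        grad = 2.0 * (A.T @ resid).reshape(n, n) @ X / m
        X = X - eta * grad
    errs[T] = np.linalg.norm(X @ X.T - M_star)
    return errs

errs_exact = run_gd(k=1)  # exact parameterization, k = r
errs_over = run_gd(k=3)   # over-parameterization, k > r
print(f"final error, k = r: {errs_exact[-1]:.2e}")
print(f"final error, k > r: {errs_over[-1]:.2e}")
```

Plotting `errs_exact` and `errs_over` on a log scale should make the contrast visible: under these assumptions the exactly parameterized run is expected to drop geometrically, while the over-parameterized run flattens into a much slower decay. The sketch does not cover the asymmetric setting or the modified step.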
Abstract
This paper rigorously shows how over-parameterization changes the convergence behaviors of gradient descent (GD) for the matrix sensing problem, where the goal is to recover an unknown low-rank ground-truth matrix from near-isotropic linear measurements. First, we consider the symmetric setting with the symmetric parameterization where $M^* \in \mathbb{R}^{n \times n}$ is a positive semi-definite unknown matrix of rank $r \ll n$, and one uses a symmetric parameterization $XX^\top$ to learn $M^*$. Here $X \in \mathbb{R}^{n \times k}$ with $k > r$ is the factor matrix. We give a novel $\Omega(1/T^2)$ lower bound of randomly initialized GD for the over-parameterized case ($k > r$) where $T$ is the number of iterations. This is in stark contrast to the exact-parameterization scenario ($k = r$) where the convergence rate is $\exp(-\Omega(T))$. Next, we study the asymmetric setting where $M^* \in…
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Quantum Information and Cryptography · Orbital Angular Momentum in Optics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
