Understanding Incremental Learning of Gradient Descent: A Fine-grained Analysis of Matrix Sensing

Jikai Jin; Zhiyuan Li; Kaifeng Lyu; Simon S. Du; Jason D. Lee

arXiv:2301.11500·cs.LG·January 30, 2023·1 cites

Open Access

TL;DR

This paper offers a detailed analysis of how gradient descent incrementally learns low-rank matrices in the matrix sensing problem, revealing its similarity to greedy heuristics and extending understanding beyond over-parameterized models.

Contribution

It provides a comprehensive characterization of the entire GD learning process for matrix sensing, including under-parameterized regimes, which was not previously analyzed.

Findings

01. GD with small initialization mimics greedy low-rank learning

02. GD learns solutions with increasing rank sequentially

03. The analysis applies to both over- and under-parameterized regimes
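
A heuristic calculation can make the sequential-rank finding concrete. The block below is a sketch under simplifying assumptions that are not taken from the paper: a symmetric positive semidefinite ground truth $Z$, a factorized iterate $U_t$, and an idealized population gradient in which the measurement operator acts as the identity. The symbols $\eta$ (step size), $\alpha$ (initialization scale), and $a_k$ are introduced here for illustration only, and this is not the paper's proof.

```latex
% Assumed setting: Z = \sum_k \lambda_k v_k v_k^\top with \lambda_1 > \lambda_2 > \dots > 0,
% and the idealized update
\[
  U_{t+1} = U_t - \eta \,\bigl(U_t U_t^\top - Z\bigr)\, U_t .
\]
% While U_t U_t^\top is still negligible, projecting onto v_k gives
\[
  v_k^\top U_{t+1} \approx (1 + \eta \lambda_k)\, v_k^\top U_t
  \quad\Longrightarrow\quad
  a_k(t) := \lVert v_k^\top U_t \rVert \approx a_k(0)\,(1 + \eta \lambda_k)^t .
\]
% With a_k(0) of order \alpha, component k reaches its target size \sqrt{\lambda_k} after about
\[
  T_k \approx \frac{\log\bigl(\sqrt{\lambda_k}/\alpha\bigr)}{\log(1 + \eta \lambda_k)}
  \approx \frac{1}{\eta \lambda_k}\,\log\frac{1}{\alpha}
  \qquad (\alpha \to 0).
\]
% Hence T_1 < T_2 < \dots, and the gaps T_{k+1} - T_k widen as \alpha shrinks.
```

Under these assumptions, the directions are picked up one at a time in order of decreasing eigenvalue, which is consistent with the rank-by-rank behavior listed above.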

Abstract

It is believed that Gradient Descent (GD) induces an implicit bias towards good generalization in training machine learning models. This paper provides a fine-grained analysis of the dynamics of GD for the matrix sensing problem, whose goal is to recover a low-rank ground-truth matrix from near-isotropic linear measurements. It is shown that GD with small initialization behaves similarly to the greedy low-rank learning heuristics (Li et al., 2020) and follows an incremental learning procedure (Gissin et al., 2019): GD sequentially learns solutions with increasing ranks until it recovers the ground truth matrix. Compared to existing works which only analyze the first learning phase for rank-1 solutions, our result provides characterizations for the whole learning process. Moreover, besides the over-parameterized regime that many prior works focused on, our analysis of the incremental…
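
The incremental behavior described in the abstract can be observed in a small numerical experiment. The following is a minimal sketch, not the paper's experimental setup: it assumes a symmetric positive semidefinite ground truth, symmetrized Gaussian measurement matrices, and a factorized parameterization `U @ U.T`. The function names (`run_matrix_sensing_gd`, `first_step_above`) and all default values (dimension, eigenvalues, width, step size, initialization scale) are hypothetical choices made for illustration.

```python
import numpy as np


def run_matrix_sensing_gd(d=8, true_eigs=(4.0, 1.0), width=3, m=1500,
                          init_scale=1e-3, lr=0.05, steps=1500, seed=0):
    """GD on f(U) = 1/(4m) * sum_i (<A_i, U U^T> - y_i)^2 from a small init.

    All defaults are illustrative assumptions, not values from the paper.
    Returns the top `width` eigenvalues of U U^T at every step and the
    final relative recovery error.
    """
    rng = np.random.default_rng(seed)

    # Ground truth Z: symmetric PSD with the prescribed nonzero eigenvalues.
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    k = len(true_eigs)
    Z = (Q[:, :k] * np.array(true_eigs)) @ Q[:, :k].T

    # Symmetrized Gaussian measurements, flattened so each step is a matvec.
    G = rng.standard_normal((m, d, d))
    A_flat = ((G + G.transpose(0, 2, 1)) / 2).reshape(m, d * d)
    y = A_flat @ Z.ravel()

    # Small random initialization of the d x width factor.
    U = init_scale * rng.standard_normal((d, width))
    eig_history = np.empty((steps, width))

    for t in range(steps):
        residual = A_flat @ (U @ U.T).ravel() - y
        # Gradient with respect to X = U U^T (symmetric because each A_i is).
        grad_X = (A_flat.T @ residual).reshape(d, d) / m
        U = U - lr * grad_X @ U
        # eigvalsh returns ascending order; keep the largest `width` values.
        eig_history[t] = np.linalg.eigvalsh(U @ U.T)[::-1][:width]

    rel_err = np.linalg.norm(U @ U.T - Z) / np.linalg.norm(Z)
    return eig_history, rel_err


def first_step_above(series, threshold):
    """Index of the first entry of `series` exceeding `threshold`, or None."""
    hits = np.nonzero(series > threshold)[0]
    return int(hits[0]) if hits.size else None


if __name__ == "__main__":
    eig_history, rel_err = run_matrix_sensing_gd()
    # Steps at which each learned eigenvalue first reaches half its target.
    t1 = first_step_above(eig_history[:, 0], 2.0)
    t2 = first_step_above(eig_history[:, 1], 0.5)
    print("first component learned at step:", t1)
    print("second component learned at step:", t2)
    print("final relative error:", rel_err)
```

In this sketch the leading eigenvalue of `U @ U.T` should reach its target well before the second one starts to grow, so the iterate passes near a rank-1 solution before fitting the rank-2 ground truth. Shrinking `init_scale` is expected to lengthen the gap between the two phases.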


Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Machine Learning and ELM · Stochastic Gradient Optimization Techniques