Noise Regularizes Over-parameterized Rank One Matrix Recovery, Provably
Tianyi Liu, Yan Li, Enlu Zhou, Tuo Zhao

TL;DR
This paper shows that random perturbation in gradient descent acts as a regularizer for over-parameterized rank one matrix recovery, reducing the mean square error from O(σ^2) to O(σ^2/d), where σ^2 is the variance of the observational noise.
Contribution
It provides a theoretical analysis showing how noise regularizes over-parameterized models, leading to better recovery accuracy than gradient descent without random perturbation.
Findings
Random perturbation reduces mean square error to O(σ^2/d)
Gradient descent without random perturbation attains a mean square error of only O(σ^2)
Noise acts as an implicit regularizer in over-parameterized models
Abstract
We investigate the role of noise in optimization algorithms for learning over-parameterized models. Specifically, we consider the recovery of a rank one matrix Y* ∈ R^{d×d} from a noisy observation Y using an over-parameterization model. We parameterize the rank one matrix Y* by XX^T, where X ∈ R^{d×d}. We then show that under mild conditions, the estimator, obtained by the randomly perturbed gradient descent algorithm using the square loss function, attains a mean square error of O(σ^2/d), where σ^2 is the variance of the observational noise. In contrast, the estimator obtained by gradient descent without random perturbation only attains a mean square error of O(σ^2). Our result partially justifies the implicit regularization effect of noise when learning over-parameterized models, and provides new understanding of training…
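The setup in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the update rule X ← X − η∇F(X + W) with Gaussian W, the 1/4 loss scaling, the small random initialization, and all names and constants (`recover`, `perturb_scale`, `lr`, `d = 20`, `sigma = 0.05`) are hypothetical choices made for this example.

```python
import numpy as np


def loss(x, y):
    """Square loss F(X) = (1/4) * ||X X^T - Y||_F^2 (the 1/4 scaling is an assumption)."""
    return 0.25 * np.linalg.norm(x @ x.T - y, "fro") ** 2


def grad(x, y):
    """Gradient of loss() with respect to X; valid for non-symmetric Y as well."""
    r = x @ x.T - y
    return 0.5 * (r + r.T) @ x


def recover(y, steps, lr, perturb_scale, rng):
    """Run (perturbed) gradient descent and return the estimate X X^T.

    perturb_scale = 0 gives plain gradient descent. The Gaussian perturbation
    and the small random initialization are illustrative assumptions.
    """
    d = y.shape[0]
    x = 0.01 * rng.standard_normal((d, d))  # over-parameterized: X is d x d
    for _ in range(steps):
        w = perturb_scale * rng.standard_normal((d, d))
        x = x - lr * grad(x + w, y)  # gradient evaluated at the perturbed point
    return x @ x.T


def mse(estimate, y_star):
    """Per-entry mean square error against the ground truth."""
    return float(np.mean((estimate - y_star) ** 2))


rng = np.random.default_rng(0)
d, sigma = 20, 0.05  # hypothetical problem size and noise level
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
y_star = np.outer(u, u)                           # rank one ground truth
y = y_star + sigma * rng.standard_normal((d, d))  # noisy observation

plain = recover(y, steps=500, lr=0.1, perturb_scale=0.0, rng=rng)
perturbed = recover(y, steps=500, lr=0.1, perturb_scale=0.05, rng=rng)
print("MSE, plain GD:    ", mse(plain, y_star))
print("MSE, perturbed GD:", mse(perturbed, y_star))
```

The printed values depend on the seed, step size, and perturbation scale, so a single run should not be read as a check of the O(σ^2/d) versus O(σ^2) rates; the sketch only shows where the random perturbation enters the update.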
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Neural Networks and Applications
