Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo

TL;DR
This paper investigates how gradient descent over deep linear neural networks implicitly favors low-rank solutions in matrix completion and sensing. It finds that added depth strengthens this bias, and it questions whether standard norm-based regularizers can describe it.
Contribution
The paper provides theoretical and experimental evidence that deeper matrix factorizations have a stronger implicit bias toward low rank, and it argues that standard regularizers such as simple norms may not suffice to describe this implicit regularization.
Findings
Deeper networks tend to produce lower-rank solutions.
Implicit regularization in matrix factorization may not be fully captured by simple mathematical norms.
Added depth often leads to more accurate matrix recovery.
Abstract
Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity." We study the implicit regularization of gradient descent over deep linear neural networks for matrix completion and sensing, a model referred to as deep matrix factorization. Our first finding, supported by theory and experiments, is that adding depth to a matrix factorization enhances an implicit tendency towards low-rank solutions, oftentimes leading to more accurate recovery. Secondly, we present theoretical and empirical arguments questioning a nascent view by which implicit regularization in matrix factorization can be captured using simple mathematical norms. Our results point to the possibility that the language of standard regularizers may not be rich enough to fully…
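The setup described in the abstract can be illustrated with a minimal NumPy sketch: a matrix is parameterized as a product of several factors, and plain gradient descent is run on the squared error over the observed entries only. This is not the authors' code. The function names (product, deep_mf_complete), the square d x d factors, the Gaussian initialization scale, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import numpy as np

def product(factors):
    """Multiply factors [W_1, ..., W_N] into the end-to-end matrix W_N ... W_1."""
    out = factors[0]
    for W in factors[1:]:
        out = W @ out
    return out

def deep_mf_complete(target, mask, depth=3, lr=0.05, steps=2000,
                     init_scale=0.1, seed=0):
    """Gradient descent on a depth-`depth` factorization for matrix completion.

    target: square (d, d) array; only entries where mask == 1 are used.
    mask:   (d, d) array of 0/1 flags marking observed entries.
    All hyperparameters are illustrative assumptions, not values from the paper.
    Returns the learned end-to-end matrix and the per-step training losses.
    """
    rng = np.random.default_rng(seed)
    d = target.shape[0]
    # Small random initialization of every factor (assumed scheme).
    factors = [init_scale * rng.standard_normal((d, d)) for _ in range(depth)]
    losses = []
    for _ in range(steps):
        W = product(factors)
        residual = mask * (W - target)          # error on observed entries only
        losses.append(0.5 * np.sum(residual ** 2))
        grads = []
        for j in range(depth):
            # W = left @ factors[j] @ right, so dLoss/dW_j = left^T residual right^T.
            left = np.eye(d)
            for Wk in factors[j + 1:]:
                left = Wk @ left
            right = np.eye(d)
            for Wk in factors[:j]:
                right = Wk @ right
            grads.append(left.T @ residual @ right.T)
        # Simultaneous update of all factors.
        factors = [Wj - lr * g for Wj, g in zip(factors, grads)]
    return product(factors), losses

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    d, rank = 10, 2
    target = rng.standard_normal((d, rank)) @ rng.standard_normal((rank, d))
    target /= np.linalg.norm(target, 2)         # scale top singular value to 1
    mask = (rng.random((d, d)) < 0.5).astype(float)
    for depth in (2, 3):
        W, losses = deep_mf_complete(target, mask, depth=depth)
        spectrum = np.round(np.linalg.svd(W, compute_uv=False), 3)
        print("depth", depth, "final loss", losses[-1], "singular values", spectrum)
```

Comparing the printed singular values across depths gives a rough, informal view of how quickly the spectrum decays. Any such comparison depends on the assumed initialization and step size, so it should be read as a toy illustration and not as a reproduction of the paper's experiments.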
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Numerical methods in inverse problems
