Implicit Bias in Matrix Factorization and its Explicit Realization in a New Architecture

Yikun Hou; Suvrit Sra; Alp Yurtsever

arXiv:2501.16322 · cs.LG · November 4, 2025

TL;DR

This paper investigates the implicit low-rank bias in matrix factorization under gradient descent and introduces a new model with explicit constraints that reliably produces low-rank solutions, extending the concept to neural networks.

Contribution

The paper presents a novel factorization model with constrained factors and a diagonal component, explicitly capturing the implicit bias and extending it to neural network architectures.

Findings

01

The new model consistently yields truly low-rank solutions.

02

Experiments demonstrate the model's strong implicit bias and stability.

03

The neural network extension achieves competitive performance with low-rank representations.
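The distinction between truly and approximately low-rank solutions in the findings above can be made concrete with a small numerical check. The sketch below is illustrative only: the function names, the example matrices, and the tolerance values are assumptions chosen for this example, not the criteria used in the paper.

```python
import numpy as np

def rank_exact(A):
    """Count singular values that are nonzero up to floating-point error."""
    s = np.linalg.svd(A, compute_uv=False)
    # Machine-precision threshold (assumed here; scales with the largest singular value).
    tol = s.max() * max(A.shape) * np.finfo(A.dtype).eps
    return int(np.sum(s > tol))

def rank_effective(A, rel_tol=1e-3):
    """Count singular values above a loose relative threshold (hypothetical choice)."""
    s = np.linalg.svd(A, compute_uv=False)
    return int(np.sum(s > rel_tol * s.max()))

# Two trailing singular values are exactly zero: truly rank 2.
truly_low_rank = np.diag([3.0, 1.0, 0.0, 0.0])
# Two trailing singular values are tiny but nonzero: only approximately rank 2.
approx_low_rank = np.diag([3.0, 1.0, 1e-6, 1e-6])
```

Under these assumptions, both matrices have effective rank 2, but only the first has exact rank 2; the second is full rank once the threshold drops to machine precision.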

Abstract

Gradient descent for matrix factorization exhibits an implicit bias toward approximately low-rank solutions. While existing theories often assume the boundedness of iterates, empirically the bias persists even with unbounded sequences. This reflects a dynamic where factors develop low-rank structure while their magnitudes increase, tending to align with certain directions. To capture this behavior in a stable way, we introduce a new factorization model: $X \approx U D V^{\top}$, where $U$ and $V$ are constrained within norm balls, while $D$ is a diagonal factor allowing the model to span the entire search space. Experiments show that this model consistently exhibits a strong implicit bias, yielding truly (rather than approximately) low-rank solutions. Extending the idea to neural networks, we introduce a new model featuring constrained layers and diagonal components that achieves competitive…
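To make the factorization in the abstract concrete, here is a minimal sketch of fitting $X \approx U D V^{\top}$ by projected gradient descent. Several details are assumptions made for illustration and are not taken from the paper: the Frobenius-norm ball of radius 1, the squared loss on a fully observed matrix, the step size, the initialization, and leaving the diagonal unconstrained. The names `project_onto_ball` and `fit_udv` are hypothetical and are not the API of the official repository.

```python
import numpy as np

def project_onto_ball(M, radius=1.0):
    """Euclidean projection onto the Frobenius-norm ball of the given radius."""
    norm = np.linalg.norm(M)  # Frobenius norm for a 2-D array
    return M if norm <= radius else M * (radius / norm)

def fit_udv(X, k, steps=500, lr=0.05, radius=1.0, seed=0):
    """Projected gradient descent on 0.5 * ||U diag(d) V^T - X||_F^2.

    U and V are kept inside a norm ball (assumed Frobenius, radius 1);
    the diagonal d is left unconstrained in this sketch.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = project_onto_ball(0.1 * rng.standard_normal((m, k)), radius)
    V = project_onto_ball(0.1 * rng.standard_normal((n, k)), radius)
    d = np.ones(k)
    for _ in range(steps):
        R = (U * d) @ V.T - X                  # residual U D V^T - X
        grad_U = R @ (V * d)                   # R V D
        grad_V = R.T @ (U * d)                 # R^T U D
        grad_d = np.sum(U * (R @ V), axis=0)   # entries u_i^T R v_i
        U = project_onto_ball(U - lr * grad_U, radius)
        V = project_onto_ball(V - lr * grad_V, radius)
        d = d - lr * grad_d
    return U, d, V

# Example: a rank-2 target fitted with k = 4 columns.
rng = np.random.default_rng(1)
X = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 5))
U, d, V = fit_udv(X, k=4)
print(np.linalg.norm(U), np.linalg.norm(V))  # both stay within the ball
print(np.round(d, 3))                        # diagonal carries the scale
```

Because $U$ and $V$ are bounded, any growth in magnitude has to be absorbed by the diagonal, which is why inspecting the entries of `d` is a natural way to look for low-rank structure in a sketch like this one.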

Code & Models

Repositories

Titanium-H/UDV
PyTorch · Official

Taxonomy

Topics: graph theory and CDMA systems