Implicit Bias and Convergence of Matrix Stochastic Mirror Descent

Danil Akhtiamov; Reza Ghane; Omead Pooladzandi; Babak Hassibi

arXiv:2602.18997 · stat.ML · March 2, 2026

TL;DR

This paper shows that stochastic mirror descent with matrix parameters converges exponentially to a global interpolator in the overparameterized regime, and that the choice of mirror map determines the inductive bias of the solution in high-dimensional, multi-output problems.

Contribution

The paper extends implicit bias results from vector to matrix stochastic mirror descent, showing that in the overparameterized regime the iterates converge to the interpolating solution that minimizes the Bregman divergence from initialization.

Findings

01. SMD with matrix parameters converges exponentially to a global interpolator.

02. The limit point is the unique interpolating solution that minimizes the Bregman divergence induced by the mirror function from initialization.

03. Matrix mirror functions influence the inductive bias in multi-output learning.
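
The second finding can be written out as a constrained problem. The symbols $W_0$ (initialization), $W_\infty$ (limit of the iterates), $f_W$ (the model's vector-valued prediction), and $(x_i, y_i)_{i=1}^n$ (training pairs) are notation assumed here for illustration and may differ from the paper's; the Bregman divergence is the standard definition, taken with the trace inner product.

$$D_\psi(W, W_0) = \psi(W) - \psi(W_0) - \langle \nabla\psi(W_0),\, W - W_0 \rangle, \qquad \langle A, B \rangle = \operatorname{tr}(A^\top B)$$

$$W_\infty = \arg\min_{W} \; D_\psi(W, W_0) \quad \text{subject to} \quad f_W(x_i) = y_i, \quad i = 1, \dots, n$$

For example, with $\psi(W) = \tfrac{1}{2}\|W\|_F^2$ the divergence is $D_\psi(W, W_0) = \tfrac{1}{2}\|W - W_0\|_F^2$, so the characterization reduces to the interpolator closest to the initialization in Frobenius norm.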

Abstract

We investigate Stochastic Mirror Descent (SMD) with matrix parameters and vector-valued predictions, a framework relevant to multi-class classification and matrix completion problems. Focusing on the overparameterized regime, where the total number of parameters exceeds the number of training samples, we prove that SMD with matrix mirror functions $\psi(\cdot)$ converges exponentially to a global interpolator. Furthermore, we generalize classical implicit bias results of vector SMD by demonstrating that the matrix SMD algorithm converges to the unique solution minimizing the Bregman divergence induced by $\psi(\cdot)$ from initialization subject to interpolating the data. These findings reveal how matrix mirror maps dictate inductive bias in high-dimensional, multi-output problems.
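
To make the update rule concrete, here is a minimal NumPy sketch of SMD on a matrix parameter. It is an illustration under stated assumptions, not the authors' implementation: the model is taken to be linear ($W x$ with $W \in \mathbb{R}^{k \times d}$), the loss is the squared error on one sample, and the mirror function is the entrywise potential $\psi(W) = \tfrac{1}{q}\sum_{ij} |W_{ij}|^q$. The function names (`grad_psi`, `grad_psi_inv`, `matrix_smd`), the data, and all hyperparameters are hypothetical. With $q = 2$ the method reduces to SGD, for which the minimum Bregman divergence interpolator has a closed form that the sketch computes for comparison.

```python
import numpy as np

def grad_psi(W, q):
    # Mirror map: gradient of psi(W) = (1/q) * sum_ij |W_ij|^q (assumed potential).
    return np.sign(W) * np.abs(W) ** (q - 1)

def grad_psi_inv(Z, q):
    # Inverse of the mirror map, taking dual variables back to parameters.
    return np.sign(Z) * np.abs(Z) ** (1.0 / (q - 1))

def matrix_smd(X, Y, W0, q=2.0, lr=0.5, steps=5000, seed=0):
    # SMD step: grad_psi(W_next) = grad_psi(W) - lr * (gradient of the sampled loss at W).
    rng = np.random.default_rng(seed)
    Z = grad_psi(W0, q)              # dual (mirror) variable
    W = W0.copy()
    for _ in range(steps):
        i = rng.integers(X.shape[0])             # sample one training pair
        residual = W @ X[i] - Y[i]               # vector-valued prediction error, shape (k,)
        Z = Z - lr * np.outer(residual, X[i])    # gradient of 0.5 * ||W x_i - y_i||^2
        W = grad_psi_inv(Z, q)
    return W

# Overparameterized toy problem (hypothetical data): n = 5 samples, d = 20 features, k = 3 outputs.
data_rng = np.random.default_rng(1)
n, d, k = 5, 20, 3
X = data_rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # unit-norm inputs keep lr = 0.5 stable
Y = data_rng.standard_normal((n, k))
W0 = 0.1 * data_rng.standard_normal((k, d))

W_smd = matrix_smd(X, Y, W0, q=2.0)

# For q = 2 the Bregman divergence is 0.5 * ||W - W0||_F^2, so the predicted limit is
# the interpolator closest to W0 in Frobenius norm.
W_star = W0 + (Y.T - W0 @ X.T) @ np.linalg.pinv(X.T)
```

Comparing `W_smd` with `W_star` checks the implicit bias statement in the one case where the answer is available in closed form. Other values of `q` change the mirror map and therefore which interpolator is selected, but they may need a smaller `lr` and have no closed-form limit to compare against in this sketch.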

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Face and Expression Recognition