Implicit Regularization Properties of Variance Reduced Stochastic Mirror Descent
Yiling Luo, Xiaoming Huo, Yajun Mei

TL;DR
This paper investigates the implicit regularization properties of the variance reduced stochastic mirror descent (VRSMD) algorithm, proving its convergence to the minimum mirror interpolant in linear regression and demonstrating its effectiveness in sparse model estimation.
Contribution
It establishes the implicit regularization property of VRSMD and provides theoretical and empirical insights into its performance in linear regression and sparse models.
Findings
VRSMD converges to the minimum mirror interpolant in linear regression.
VRSMD exhibits an implicit regularization property like the one known for gradient descent and stochastic gradient descent.
Numerical examples show VRSMD's empirical effectiveness.
Abstract
In machine learning and statistical data analysis, we often run into an objective function that is a summation: the number of terms in the summation is possibly equal to the sample size, which can be enormous. In such a setting, the stochastic mirror descent (SMD) algorithm is a numerically efficient method -- each iteration involves a very small subset of the data. The variance reduction version of SMD (VRSMD) can further improve SMD by inducing faster convergence. On the other hand, algorithms such as gradient descent and stochastic gradient descent have the implicit regularization property that leads to better performance in terms of the generalization errors. Little is known about whether such a property holds for VRSMD. We prove here that the discrete VRSMD estimator sequence converges to the minimum mirror interpolant in the linear regression. This establishes the implicit…
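The convergence claim in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the variance reduction is SVRG-style (a full gradient at a snapshot plus a per-sample correction), the mirror map is the potential psi(w) = sum_j |w_j|^q / q, the iterate starts at zero, and the names vrsmd, mirror_inverse, and residuals, the step size, and the toy data are all hypothetical. For q = 2 the mirror map is the identity, so the minimum mirror interpolant is the minimum Euclidean-norm solution of the underdetermined system.

```python
import math
import random


def mirror_inverse(z, q):
    """Map a dual vector z back to the primal for psi(w) = sum |w_j|^q / q.

    grad psi(w)_j = sign(w_j) |w_j|^(q-1), so the inverse is
    w_j = sign(z_j) |z_j|^(1/(q-1)). For q = 2 this is the identity.
    """
    p = 1.0 / (q - 1.0)
    return [math.copysign(abs(zj) ** p, zj) for zj in z]


def residuals(X, y, w):
    """Residuals x_i . w - y_i for every sample."""
    return [sum(xj * wj for xj, wj in zip(x, w)) - yi for x, yi in zip(X, y)]


def vrsmd(X, y, q=2.0, eta=0.02, epochs=40, inner_steps=200, seed=0):
    """SVRG-style stochastic mirror descent on (1/2n) sum_i (x_i . w - y_i)^2.

    Assumed illustrative variant, not necessarily the paper's algorithm.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    z = [0.0] * d                 # dual iterate z = grad psi(w); w0 = 0 gives z0 = 0
    w = mirror_inverse(z, q)
    for _ in range(epochs):
        # Snapshot: residuals and full gradient at the current iterate.
        r_snap = residuals(X, y, w)
        full_grad = [sum(X[i][j] * r_snap[i] for i in range(n)) / n
                     for j in range(d)]
        for _ in range(inner_steps):
            i = rng.randrange(n)  # one sample per inner step
            r_i = sum(xj * wj for xj, wj in zip(X[i], w)) - y[i]
            # Variance-reduced gradient:
            # grad f_i(w) - grad f_i(w_snap) + full gradient at w_snap.
            g = [X[i][j] * (r_i - r_snap[i]) + full_grad[j] for j in range(d)]
            # Mirror step: move in the dual space, then map back to the primal.
            z = [zj - eta * gj for zj, gj in zip(z, g)]
            w = mirror_inverse(z, q)
    return w


# Toy underdetermined system: 2 equations, 4 unknowns (hypothetical data).
X_demo = [[1.0, 2.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]
y_demo = [5.0, 2.0]
w_hat = vrsmd(X_demo, y_demo)
print(w_hat)
```

In this toy system the two rows are orthogonal, so the minimum Euclidean-norm interpolant is [1, 2, 1, -1], and with q = 2 the printed iterate should lie close to it. Every update adds a linear combination of data rows to the dual iterate, which is why the limit is selected by the mirror map rather than by an explicit penalty. Other values of q are included only to show where the mirror map enters; step sizes for them would need separate tuning.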
