Controlling the Flow: Stability and Convergence for Stochastic Gradient Descent with Decaying Regularization
Sebastian Kassing, Simon Weissmann, Leif Döring

TL;DR
This paper studies a variant of stochastic gradient descent with decaying Tikhonov regularization, proves that it converges to the minimum-norm solution, and analyzes how fast the regularization should decay to obtain stable learning in convex problems.
Contribution
The paper analyzes regularized SGD (reg-SGD) with a time-dependent, vanishing Tikhonov regularization parameter, providing convergence proofs and insight into stability and into balancing step-sizes against regularization decay for convex minimization.
Findings
Proves strong convergence of reg-SGD to the minimum-norm solution.
Quantifies convergence rates and optimal regularization decay.
Validates theoretical results with numerical experiments on inverse problems.
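The objects named in the findings can be written down compactly. The block below is a sketch in assumed notation: the symbols f_\lambda, x^*, X_n, \alpha_n, \lambda_n, D_{n+1} and the factor 1/2 in the penalty are conventions chosen here for illustration, not taken from the paper.

```latex
% Tikhonov-regularized objective with parameter \lambda > 0 (assumed convention)
f_\lambda(x) = f(x) + \frac{\lambda}{2}\,\|x\|^2

% Minimum-norm solution of the original (unregularized) problem
x^* = \operatorname*{arg\,min}_{x \,\in\, \operatorname{arg\,min} f} \|x\|

% reg-SGD step: a stochastic gradient step on f_{\lambda_n},
% with step-size \alpha_n, vanishing parameter \lambda_n \to 0,
% and gradient noise D_{n+1} (assumed notation)
X_{n+1} = X_n - \alpha_n \bigl( \nabla f(X_n) + \lambda_n X_n + D_{n+1} \bigr)
```

Because the penalty weight \lambda_n vanishes, the iterates are ultimately driven toward minimizers of f itself, while the penalty term keeps pulling them toward the minimizer of smallest norm.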
Abstract
The present article studies the minimization of convex, L-smooth functions defined on a separable real Hilbert space. We analyze regularized stochastic gradient descent (reg-SGD), a variant of stochastic gradient descent that uses a Tikhonov regularization with time-dependent, vanishing regularization parameter. We prove strong convergence of reg-SGD to the minimum-norm solution of the original problem without additional boundedness assumptions. Moreover, we quantify the rate of convergence and optimize the interplay between step-sizes and regularization decay. Our analysis reveals how vanishing Tikhonov regularization controls the flow of SGD and yields stable learning dynamics, offering new insights into the design of iterative algorithms for convex problems, including those that arise in ill-posed inverse problems. We validate our theoretical findings through numerical experiments on…
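The scheme described in the abstract can be illustrated with a minimal sketch on a toy underdetermined least-squares problem. Everything specific here is an assumption made for illustration: the function name `reg_sgd`, the power-law schedules `alpha0 / (n + 1) ** p` and `lam0 / (n + 1) ** q`, the exponent values, and the 2-by-3 system are not taken from the paper, and the schedules are not claimed to be the optimal ones derived there.

```python
import numpy as np


def reg_sgd(A, b, x0, n_steps, alpha0=0.5, p=0.5, lam0=1.0, q=0.25, seed=0):
    """SGD on f(x) = ||Ax - b||^2 / (2m) with a vanishing Tikhonov term.

    Hypothetical helper: the schedules below are illustrative choices,
    not the paper's recommended parameters.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    m = A.shape[0]
    for n in range(n_steps):
        alpha = alpha0 / (n + 1) ** p   # decaying step-size (assumed schedule)
        lam = lam0 / (n + 1) ** q       # vanishing regularization (assumed schedule)
        i = rng.integers(m)             # sample one data row uniformly
        grad = (A[i] @ x - b[i]) * A[i]  # unbiased stochastic gradient of f
        x = x - alpha * (grad + lam * x)  # step on f plus (lam / 2) * ||x||^2
    return x


# Toy problem: 2 equations, 3 unknowns, so the minimizers of f form a line.
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0]])
A /= np.linalg.norm(A, axis=1, keepdims=True)  # unit-norm rows
b = A @ np.array([1.0, 2.0, 0.0])              # consistent right-hand side

x_min_norm = np.linalg.pinv(A) @ b             # minimum-norm least-squares solution
null_proj = np.eye(3) - np.linalg.pinv(A) @ A  # projector onto the null space of A

x0 = np.array([4.0, 3.0, -3.0])                # start with a null-space component
x_reg = reg_sgd(A, b, x0, n_steps=20000)
x_plain = reg_sgd(A, b, x0, n_steps=20000, lam0=0.0)  # plain SGD for comparison

print("null-space part, reg-SGD:  ", np.linalg.norm(null_proj @ x_reg))
print("null-space part, plain SGD:", np.linalg.norm(null_proj @ x_plain))
print("distance to min-norm, reg-SGD:", np.linalg.norm(x_reg - x_min_norm))
```

In this toy setting every stochastic gradient lies in the row space of `A`, so plain SGD never changes the null-space component of the starting point, whereas the regularization term shrinks it by a factor `1 - alpha * lam` at each step. The remaining distance to the minimum-norm solution depends on the chosen schedules and decays only as the regularization parameter vanishes.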
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Numerical methods in inverse problems · Sparse and Compressive Sensing Techniques
Methods: Stochastic Gradient Descent
