Robust stochastic first order methods in heavy-tailed noise via medoid mini-batch gradient sampling

Manojlo Vukovic; Dusan Jakovetic

arXiv:2605.07634·math.OC·May 11, 2026

TL;DR

This paper introduces R-SGD-Mini, a robust stochastic gradient method using medoid mini-batch sampling to handle heavy-tailed noise, with proven convergence rates and favorable experimental performance.

Contribution

The paper proposes a novel medoid-based mini-batch gradient sampling method for heavy-tailed noise, providing explicit convergence bounds and high-probability guarantees.

Findings

01. R-SGD-Mini converges at rate O(T^{-1}) in expectation.

02. The method achieves a rate of O(T^{-1/2}) when the time horizon is known.

03. Experimental results favor R-SGD-Mini over traditional methods.

Abstract

We consider a first order stochastic optimization framework where, at each iteration, $K$ independent identically distributed (i.i.d.) data point samples are drawn, based on which stochastic gradients can be queried. We allow gradient noise to be heavy-tailed, with possibly infinite variances. For the considered heavy-tailed setting, many algorithmic variants have recently been proposed based on gradient clipping or other nonlinear operators (e.g., normalization) applied over noisy gradients. In this paper, we take an alternative approach and propose a novel stochastic first order method dubbed Robust Stochastic Gradient Descent with medoid mini-batch gradient sampling, R-SGD-Mini for short. The core idea of R-SGD-Mini is to split the $K$-sized data batch into $M$ distinct data chunks, form for each chunk the stochastic gradient, and update the solution estimate with respect to the…
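The chunk-and-select idea can be illustrated with a minimal Python sketch. The abstract is truncated before it states the selection rule, so several things here are assumptions rather than the authors' specification: the update uses the medoid of the $M$ chunk gradients (inferred from the method's name), the medoid is taken under Euclidean distance, each chunk gradient is the plain average of its per-sample gradients, and the step size is a constant. The names `medoid_index`, `chunk_gradients`, `medoid_sgd_step`, and `noisy_quadratic_grad` are hypothetical and do not come from the paper.

```python
import math


def medoid_index(vectors):
    """Index of the vector with the smallest summed Euclidean distance to all others.

    Assumption: Euclidean distance; the paper may use a different metric.
    Ties are broken in favor of the lowest index.
    """
    best_idx, best_cost = 0, math.inf
    for i, v in enumerate(vectors):
        cost = sum(math.dist(v, w) for w in vectors)
        if cost < best_cost:
            best_idx, best_cost = i, cost
    return best_idx


def chunk_gradients(x, samples, grad_fn, num_chunks):
    """Split the K samples into num_chunks contiguous chunks and average
    the per-sample gradients inside each chunk.

    Assumption: num_chunks divides K; leftover samples are ignored otherwise.
    """
    size = len(samples) // num_chunks
    grads = []
    for m in range(num_chunks):
        chunk = samples[m * size:(m + 1) * size]
        per_sample = [grad_fn(x, s) for s in chunk]
        grads.append([sum(g[j] for g in per_sample) / len(per_sample)
                      for j in range(len(x))])
    return grads


def medoid_sgd_step(x, samples, grad_fn, num_chunks, step_size):
    """One SGD-style update using the medoid chunk gradient (assumed rule)."""
    grads = chunk_gradients(x, samples, grad_fn, num_chunks)
    g = grads[medoid_index(grads)]
    return [xi - step_size * gi for xi, gi in zip(x, g)]


def noisy_quadratic_grad(x, noise):
    """Toy gradient of f(x) = 0.5 * ||x||^2 with additive noise (illustrative only)."""
    return [xi + ni for xi, ni in zip(x, noise)]


# Six noise samples; the last chunk contains two extreme outliers.
samples = [[0.0], [0.0], [0.0], [0.0], [1000.0], [1000.0]]
x_next = medoid_sgd_step([1.0], samples, noisy_quadratic_grad,
                         num_chunks=3, step_size=0.5)
print(x_next)
```

In this toy run the three chunk gradients are 1.0, 1.0, and 1001.0. The medoid is one of the two uncorrupted chunks, so the outlier chunk does not influence the update, whereas averaging all six samples would give a gradient of roughly 334.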
