Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD

Aniket Das; Dheeraj Nagaraj; Soumyabrata Pal; Arun Suggala; Prateek Varshney

arXiv:2410.20135 · stat.ML · October 29, 2024

TL;DR

This paper shows that Clipped-SGD achieves near-optimal statistical rates for high-dimensional heavy-tailed estimation in the streaming setting, with guarantees that extend beyond smooth, strongly convex objectives to smooth convex and Lipschitz convex ones.

Contribution

The authors establish near-optimal convergence rates for Clipped-SGD under heavy-tailed gradient noise in the streaming setting, introducing a new iterative martingale concentration technique.

Findings

01

Clipped-SGD attains near-optimal sub-Gaussian rates whenever the stochastic gradient noise has a finite second moment.

02

The error bound improves on previous rates by reducing the dependence on $\frac{1}{\delta}$.

03

The results extend to smooth convex and Lipschitz convex objectives.

Abstract

We consider the problem of high-dimensional heavy-tailed statistical estimation in the streaming setting, which is much harder than the traditional batch setting due to memory constraints. We cast this problem as stochastic convex optimization with heavy-tailed stochastic gradients, and prove that the widely used Clipped-SGD algorithm attains near-optimal sub-Gaussian statistical rates whenever the second moment of the stochastic gradient noise is finite. More precisely, with $T$ samples, we show that Clipped-SGD, for smooth and strongly convex objectives, achieves an error of $\sqrt{\frac{\operatorname{Tr}(\Sigma) + \sqrt{\operatorname{Tr}(\Sigma) \|\Sigma\|_{2}} \log\left(\frac{\log(T)}{\delta}\right)}{T}}$ with probability $1 - \delta$, where $\Sigma$ is the covariance of the clipped gradient. Note that the fluctuations (depending on $\frac{1}{\delta}$) are of lower order than the term $\operatorname{Tr}(\Sigma)$. This…
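
To make the setting in the abstract concrete, the following is a minimal sketch of one-pass gradient clipping applied to mean estimation, which can be cast as minimizing $f(x) = \frac{1}{2}\mathbb{E}\|x - z\|^2$. It is an illustration, not the authors' implementation: the function names, the fixed clipping threshold, the $1/t$ step size, and the Student-t test data are all assumptions chosen for simplicity and do not reproduce the parameter choices analyzed in the paper.

```python
import numpy as np


def clip(g, threshold):
    """Rescale g so that its Euclidean norm is at most `threshold`."""
    norm = np.linalg.norm(g)
    if norm > threshold:
        return g * (threshold / norm)
    return g


def clipped_sgd_mean(stream, dim, threshold, step_size):
    """One-pass clipped SGD on f(x) = 0.5 * E||x - z||^2, minimized at E[z].

    Only the current iterate is stored, so memory is O(dim) regardless of
    how many samples the stream yields.
    """
    x = np.zeros(dim)
    for t, z in enumerate(stream, start=1):
        g = x - z                                  # stochastic gradient at x
        x = x - step_size(t) * clip(g, threshold)  # clipped update
    return x


# Hypothetical usage: Student-t noise with 3 degrees of freedom is heavy-tailed
# but has a finite second moment, matching the assumption in the abstract.
rng = np.random.default_rng(0)
dim, num_samples = 5, 5000
true_mean = np.full(dim, 2.0)
stream = true_mean + rng.standard_t(df=3, size=(num_samples, dim))

# threshold and step size are illustrative assumptions, not the paper's schedule
estimate = clipped_sgd_mean(stream, dim, threshold=50.0,
                            step_size=lambda t: 1.0 / t)
print(np.linalg.norm(estimate - true_mean))
```

Here `stream` is an in-memory array for convenience, but any iterable of samples (for example a generator reading from disk) works, since each sample is used once and then discarded.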

Taxonomy

Topics: Distributed Sensor Networks and Detection Algorithms · Sparse and Compressive Sensing Techniques · Target Tracking and Data Fusion in Sensor Networks