Near-Optimal Streaming Heavy-Tailed Statistical Estimation with Clipped SGD
Aniket Das, Dheeraj Nagaraj, Soumyabrata Pal, Arun Suggala, Prateek Varshney

TL;DR
This paper shows that Clipped-SGD achieves near-optimal statistical rates for high-dimensional heavy-tailed estimation in the streaming setting, with guarantees that extend beyond smooth, strongly convex objectives to smooth convex and Lipschitz convex ones.
Contribution
The authors establish near-optimal convergence rates for Clipped-SGD in heavy-tailed streaming data, introducing a new iterative martingale concentration technique.
Findings
Clipped-SGD attains near-optimal sub-Gaussian rates whenever the stochastic gradient noise has a finite second moment.
The error bound improves on previous rates by reducing the dependence on $\frac{1}{\delta}$.
Results extend to smooth and Lipschitz convex objectives.
Abstract
We consider the problem of high-dimensional heavy-tailed statistical estimation in the streaming setting, which is much harder than the traditional batch setting due to memory constraints. We cast this problem as stochastic convex optimization with heavy-tailed stochastic gradients, and prove that the widely used Clipped-SGD algorithm attains near-optimal sub-Gaussian statistical rates whenever the second moment of the stochastic gradient noise is finite. More precisely, with $T$ samples, we show that Clipped-SGD, for smooth and strongly convex objectives, achieves an error of $\sqrt{\frac{\mathsf{Tr}(\Sigma)+\sqrt{\mathsf{Tr}(\Sigma)\|\Sigma\|_2}\log(\frac{\log(T)}{\delta})}{T}}$ with probability $1-\delta$, where $\Sigma$ is the covariance of the clipped gradient. Note that the fluctuations (depending on $\frac{1}{\delta}$) are of lower order than the term $\mathsf{Tr}(\Sigma)$. This…
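As a rough illustration of the streaming setup the abstract describes, the sketch below runs Clipped-SGD on a simple smooth, strongly convex objective, $F(x) = \frac{1}{2}\mathbb{E}\|x - z\|^2$, whose minimizer is the mean of the data. Each sample is seen once and only the current iterate is stored. The function names, the symmetric Pareto-type noise, the clipping threshold of 10, and the $1/t$ step size are all assumptions made for this example; they are not the paper's tuned choices, and the sketch makes no claim to reproduce its rates.

```python
import math
import random


def clip(vec, threshold):
    """Rescale vec so that its Euclidean norm is at most threshold."""
    norm = math.sqrt(sum(v * v for v in vec))
    if norm <= threshold:
        return list(vec)
    scale = threshold / norm
    return [v * scale for v in vec]


def clipped_sgd_mean(stream, dim, threshold, step_size):
    """One pass of clipped SGD on F(x) = 0.5 * E||x - z||^2.

    `stream` yields one sample at a time; only the current iterate is
    kept, so memory use is O(dim) regardless of the stream length.
    `threshold` and `step_size` are illustrative hyperparameters.
    """
    x = [0.0] * dim
    for t, z in enumerate(stream, start=1):
        grad = [xi - zi for xi, zi in zip(x, z)]  # stochastic gradient at x
        g = clip(grad, threshold)                 # clip before the update
        eta = step_size(t)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
    return x


def heavy_tailed_stream(mean, n, rng):
    """Yield n samples: mean plus symmetric Pareto-type noise.

    Shape 2.5 gives a finite second moment but heavy tails
    (an assumed noise model, chosen only for this example).
    """
    for _ in range(n):
        yield [
            m + rng.choice((-1.0, 1.0)) * (rng.paretovariate(2.5) - 1.0)
            for m in mean
        ]


if __name__ == "__main__":
    rng = random.Random(0)
    true_mean = [1.0, -2.0, 0.5]
    estimate = clipped_sgd_mean(
        heavy_tailed_stream(true_mean, 20000, rng),
        dim=3,
        threshold=10.0,
        step_size=lambda t: 1.0 / t,
    )
    print("estimate:", estimate)
```

Without clipping, the $1/t$ step size on this objective reduces to the running sample mean, so the clipping step is the only thing that limits how far a single extreme sample can move the iterate.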
Taxonomy
Topics: Distributed Sensor Networks and Detection Algorithms · Sparse and Compressive Sensing Techniques · Target Tracking and Data Fusion in Sensor Networks
