Loss minimization and parameter estimation with heavy tails
Daniel Hsu, Sivan Sabato

TL;DR
This paper introduces a robust estimation technique effective under heavy-tailed distributions, enabling near-optimal parameter estimation for various models without requiring bounded or subgaussian data.
Contribution
It generalizes the median-of-means estimator to arbitrary metric spaces and applies it to minimize convex losses and estimate parameters in heavy-tailed settings.
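A minimal sketch of how a median-of-means estimate can be selected in a general metric space, assuming the selection rule "keep the candidate whose ball covering more than half of the candidates has the smallest radius". The function names, the use of NumPy, and the Euclidean metric in the usage lines are illustrative assumptions, not the authors' code.

```python
import numpy as np

def generalized_median(candidates, dist):
    """Return the candidate whose smallest ball containing more than
    half of all candidates has the smallest radius."""
    k = len(candidates)
    # Pairwise distances under the user-supplied metric.
    D = np.array([[dist(a, b) for b in candidates] for a in candidates])
    # In each sorted row, entry k // 2 is the smallest radius whose closed
    # ball around that candidate holds k // 2 + 1 (> k / 2) candidates.
    radii = np.sort(D, axis=1)[:, k // 2]
    return candidates[int(np.argmin(radii))]

def median_of_means(samples, k, dist, rng):
    """Split samples into k random groups, average each group,
    and return the generalized median of the group means."""
    samples = np.asarray(samples, dtype=float)
    groups = np.array_split(rng.permutation(len(samples)), k)
    means = [samples[g].mean(axis=0) for g in groups]
    return generalized_median(means, dist)

# Usage (hypothetical data): heavy-tailed Student-t samples centred at 0.
euclid = lambda a, b: float(np.linalg.norm(a - b))
rng = np.random.default_rng(0)
data = rng.standard_t(df=3, size=(3000, 2))
estimate = median_of_means(data, k=15, dist=euclid, rng=rng)
```

Only the `dist` callable depends on the space, so the same selection step can in principle be reused for other estimates (for example regression coefficients or matrices) by swapping in a suitable metric.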
Findings
Requires only O(d log(1/δ)) samples in d dimensions to obtain a constant-factor approximation to the optimal least squares loss with probability 1 − δ.
Applicable to sparse linear regression and low-rank covariance matrix estimation.
Does not assume bounded or subgaussian covariates or noise.
Abstract
This work studies applications and generalizations of a simple estimation technique that provides exponential concentration under heavy-tailed distributions, assuming only bounded low-order moments. We show that the technique can be used for approximate minimization of smooth and strongly convex losses, and specifically for least squares linear regression. For instance, our d-dimensional estimator requires just O(d log(1/δ)) random samples to obtain a constant factor approximation to the optimal least squares loss with probability 1 − δ, without requiring the covariates or noise to be bounded or subgaussian. We provide further applications to sparse linear regression and low-rank covariance matrix estimation with similar allowances on the noise and covariate distributions. The core technique is a generalization of the median-of-means estimator to arbitrary metric…
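For intuition, the classical scalar median-of-means estimator that this technique builds on can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, and the input is assumed to be i.i.d. so that contiguous groups can stand in for random ones.

```python
import numpy as np

def median_of_means_1d(x, k):
    """Split x into k contiguous groups, average each group,
    and return the median of the k group means."""
    x = np.asarray(x, dtype=float)
    groups = np.array_split(x, k)
    return float(np.median([g.mean() for g in groups]))

# A single extreme value corrupts only one group mean, so the median ignores it.
sample = [1, 1, 1, 1, 1, 1, 1, 1, 1000]
print(median_of_means_1d(sample, k=3))  # prints 1.0
print(float(np.mean(sample)))           # the plain mean is pulled far above 1
```

Using more groups raises the confidence level at the cost of noisier group means, which is why the number of groups is typically tied to log(1/δ).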
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Distributed Sensor Networks and Detection Algorithms
Methods: Linear Regression
