Algorithms for Heavy-Tailed Statistics: Regression, Covariance Estimation, and Beyond
Yeshwanth Cherapanamjeri, Samuel B. Hopkins, Tarun Kathuria, Prasad Raghavendra, Nilesh Tripuraneni

TL;DR
This paper develops polynomial-time algorithms for heavy-tailed statistical problems like covariance estimation and linear regression, achieving near-optimal error bounds under minimal distributional assumptions.
Contribution
It introduces new polynomial-time estimators for heavy-tailed data that narrow the gap to the error rates previously achieved only by exponential-time estimators.
Findings
Spectral norm covariance estimation error: O(d^{3/4}/√n)
Linear regression loss: O(d/n) with high probability
Algorithms based on degree-8 sum-of-squares semidefinite programs
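To get a feel for how the two rates listed above scale with the dimension d and the sample size n, the following minimal sketch evaluates them directly. It is an illustration only, not the paper's algorithm: the function names are hypothetical, and the sketch assumes that all constants and logarithmic factors hidden by the O(·) notation are equal to 1.

```python
import math


def covariance_error_rate(d: int, n: int) -> float:
    """Rate d^{3/4} / sqrt(n) for spectral-norm covariance error.

    Assumption: hidden constants and log factors are set to 1.
    """
    return d ** 0.75 / math.sqrt(n)


def regression_loss_rate(d: int, n: int) -> float:
    """Rate d / n for linear regression loss (same simplification)."""
    return d / n


def samples_for_covariance_error(d: int, eps: float) -> int:
    """Smallest n with d^{3/4} / sqrt(n) <= eps under the same simplification.

    Solving the inequality for n gives n >= d^{3/2} / eps^2.
    """
    return math.ceil(d ** 1.5 / eps ** 2)


if __name__ == "__main__":
    d = 100
    for n in (10_000, 100_000, 1_000_000):
        print(n, covariance_error_rate(d, n), regression_loss_rate(d, n))
    print(samples_for_covariance_error(d, eps=0.1))
```

Under this simplification, halving the covariance error requires four times as many samples, while halving the regression loss requires only twice as many.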
Abstract
We study efficient algorithms for linear regression and covariance estimation in the absence of Gaussian assumptions on the underlying distributions of samples, making assumptions instead about only finitely-many moments. We focus on how many samples are needed to do estimation and regression with high accuracy and exponentially-good success probability. For covariance estimation, linear regression, and several other problems, estimators have recently been constructed with sample complexities and rates of error matching what is possible when the underlying distribution is Gaussian, but algorithms for these estimators require exponential time. We narrow the gap between the Gaussian and heavy-tailed settings for polynomial-time estimators with: 1. A polynomial-time estimator which takes samples from a random vector with covariance and produces …
