Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions

Ilias Diakonikolas; Daniel M. Kane; Jasper C.H. Lee; Ankit Pensia

arXiv:2211.16333·cs.DS·November 30, 2022

TL;DR

This paper gives the first sample-efficient, polynomial-time algorithm for outlier-robust sparse mean estimation of high-dimensional heavy-tailed distributions, requiring only mild moment assumptions.

Contribution

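The paper presents a polynomial-time estimator for heavy-tailed distributions with sparse means. It achieves the optimal asymptotic error with a sample complexity that is logarithmic in the ambient dimension, extending robust sparse mean estimation beyond the light-tailed setting covered by prior work.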

Findings

01. Achieves the optimal asymptotic error using a number of samples that scales logarithmically with the ambient dimension.

02. Sample complexity is optimal as a function of the failure probability τ.

03. Uses a stability-based approach with new matrix decomposition techniques.

Abstract

We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean $μ$ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates $μ$ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability $τ$, having an…
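To make the problem setup concrete, the following is a minimal sketch of the estimation task only. It is not the paper's algorithm: it swaps in a naive baseline (coordinate-wise median-of-means followed by keeping the k largest coordinates) that carries none of the paper's guarantees. The Student-t noise model, the outlier value 50, the Euclidean error measure, and all names and parameter values (`make_corrupted_samples`, `median_of_means`, `keep_top_k`, `eps`, `num_blocks`) are assumptions made purely for illustration.

```python
import numpy as np


def make_corrupted_samples(n, dim, k, eps, rng):
    """Draw heavy-tailed samples around a k-sparse mean, then corrupt an eps fraction.

    Assumption: Student-t noise with 3 degrees of freedom (finite variance,
    heavy tails) and outliers that are all set to the constant 50.
    """
    mu = np.zeros(dim)
    mu[:k] = 1.0  # hypothetical k-sparse true mean
    samples = mu + rng.standard_t(df=3, size=(n, dim))
    bad_idx = rng.choice(n, size=int(eps * n), replace=False)
    samples[bad_idx] = 50.0  # adversary overwrites these rows
    return mu, samples


def median_of_means(samples, num_blocks):
    """Coordinate-wise median of the means of num_blocks equal-sized blocks."""
    n, dim = samples.shape
    usable = (n // num_blocks) * num_blocks  # drop the remainder rows
    blocks = samples[:usable].reshape(num_blocks, -1, dim)
    return np.median(blocks.mean(axis=1), axis=0)


def keep_top_k(vector, k):
    """Zero out everything except the k entries of largest magnitude."""
    out = np.zeros_like(vector)
    idx = np.argsort(np.abs(vector))[-k:]
    out[idx] = vector[idx]
    return out


rng = np.random.default_rng(0)
mu, samples = make_corrupted_samples(n=2000, dim=200, k=5, eps=0.02, rng=rng)

# Naive estimate: the empirical mean, which the outliers shift in every coordinate.
naive_err = np.linalg.norm(samples.mean(axis=0) - mu)

# Baseline robust estimate: median-of-means, then sparsify to k coordinates.
estimate = keep_top_k(median_of_means(samples, num_blocks=200), k=5)
robust_err = np.linalg.norm(estimate - mu)

print(f"empirical mean error:   {naive_err:.3f}")
print(f"baseline robust error:  {robust_err:.3f}")
```

The number of blocks is chosen well above twice the number of corrupted samples so that most blocks stay clean. A coordinate-wise baseline like this one is biased by the outliers in every coordinate, which is one reason more careful methods are needed to reach the error rates described in the abstract.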

Videos

Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions · SlidesLive

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Advanced Statistical Methods and Models