Outlier-Robust Sparse Mean Estimation for Heavy-Tailed Distributions
Ilias Diakonikolas, Daniel M. Kane, Jasper C.H. Lee, Ankit Pensia

TL;DR
This paper introduces the first sample-efficient, polynomial-time robust estimator for the sparse mean of a high-dimensional heavy-tailed distribution, tolerating outliers under only mild moment assumptions.
Contribution
It presents a polynomial-time estimator that achieves the optimal asymptotic error with a number of samples logarithmic in the ambient dimension, extending robust sparse mean estimation from light-tailed to heavy-tailed distributions.
Findings
Achieves asymptotically optimal error using a number of samples that scales logarithmically with the ambient dimension.
Sample complexity scales optimally with failure probability τ.
Uses a stability-based approach with new matrix decomposition techniques.
Abstract
We study the fundamental task of outlier-robust mean estimation for heavy-tailed distributions in the presence of sparsity. Specifically, given a small number of corrupted samples from a high-dimensional heavy-tailed distribution whose mean μ is guaranteed to be sparse, the goal is to efficiently compute a hypothesis that accurately approximates μ with high probability. Prior work had obtained efficient algorithms for robust sparse mean estimation of light-tailed distributions. In this work, we give the first sample-efficient and polynomial-time robust sparse mean estimator for heavy-tailed distributions under mild moment assumptions. Our algorithm achieves the optimal asymptotic error using a number of samples scaling logarithmically with the ambient dimension. Importantly, the sample complexity of our method is optimal as a function of the failure probability τ, having an…
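To make the problem setting concrete, the sketch below builds a small synthetic instance (heavy-tailed noise around a k-sparse mean, with a fraction of samples replaced by outliers) and runs a simple baseline on it: coordinate-wise median-of-means followed by hard thresholding to the k largest coordinates. This is not the paper's stability-based algorithm and carries none of its guarantees; it only illustrates the inputs and the output. All function names, the Student-t noise model, and the outlier placement are assumptions chosen for illustration.

```python
import numpy as np


def make_corrupted_samples(rng, n, d, k, eps):
    """Hypothetical test data: Student-t noise (3 degrees of freedom, finite
    variance) around a k-sparse mean, with an eps-fraction of rows replaced
    by a single far-away point."""
    mu = np.zeros(d)
    mu[:k] = 2.0
    samples = mu + rng.standard_t(df=3, size=(n, d))
    num_outliers = int(eps * n)
    samples[:num_outliers] = 50.0  # crude outliers, not a worst-case adversary
    # Shuffle rows so the outliers are spread across buckets.
    return rng.permutation(samples), mu


def median_of_means(samples, num_buckets):
    """Coordinate-wise median of bucket means."""
    n, d = samples.shape
    usable = (n // num_buckets) * num_buckets  # drop leftover rows
    buckets = samples[:usable].reshape(num_buckets, -1, d)
    return np.median(buckets.mean(axis=1), axis=0)


def sparse_mean_baseline(samples, k, num_buckets):
    """Baseline only: keep the k largest-magnitude coordinates of the
    median-of-means estimate and zero out the rest."""
    estimate = median_of_means(samples, num_buckets)
    keep = np.argsort(np.abs(estimate))[-k:]
    sparse_estimate = np.zeros_like(estimate)
    sparse_estimate[keep] = estimate[keep]
    return sparse_estimate


rng = np.random.default_rng(0)
samples, mu = make_corrupted_samples(rng, n=2000, d=200, k=5, eps=0.02)
baseline = sparse_mean_baseline(samples, k=5, num_buckets=200)
print("baseline l2 error:      ", np.linalg.norm(baseline - mu))
print("empirical mean l2 error:", np.linalg.norm(samples.mean(axis=0) - mu))
```

The number of buckets is chosen to exceed twice the number of outliers, so that a majority of buckets remain uncorrupted; the baseline still inherits a bias from the corrupted buckets, which is one reason more careful methods are needed.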
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Advanced Statistical Methods and Models
