Sample Complexity Bounds for Robust Mean Estimation with Mean-Shift Contamination
Ilias Diakonikolas, Giannis Iakovidis, Daniel M. Kane, and Sihan Liu

TL;DR
This paper investigates the sample complexity of robust mean estimation under mean-shift contamination for general distributions, providing new algorithms and bounds using Fourier analysis.
Contribution
The paper gives an algorithm for mean estimation under mean-shift contamination that is sample-efficient under mild spectral conditions, and it proves matching lower bounds, extending prior results beyond the Gaussian and Laplace distributions.
Findings
Existence of a sample-efficient algorithm under mild spectral conditions
Matching lower bounds on sample complexity
Use of Fourier analysis and the concept of a Fourier witness
Abstract
We study the basic task of mean estimation in the presence of mean-shift contamination. In the mean-shift contamination model, an adversary is allowed to replace a small constant fraction of the clean samples by samples drawn from arbitrarily shifted versions of the base distribution. Prior work characterized the sample complexity of this task for the special cases of the Gaussian and Laplace distributions. Specifically, it was shown that consistent estimation is possible in these cases, a property that is provably impossible in Huber's contamination model. An open question posed in earlier work was to determine the sample complexity of mean estimation in the mean-shift contamination model for general base distributions. In this work, we study and essentially resolve this open question. Specifically, we show that, under mild spectral conditions on the characteristic function of the…
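To make the contamination model in the abstract concrete, the following is a minimal simulation sketch in one dimension. It is not the paper's algorithm. The function name `mean_shift_samples`, the fixed list of shifts, the Gaussian base noise, and the parameter values are all illustrative assumptions. The sketch only shows that contaminated points keep the shape of the base distribution while their mean is moved.

```python
import random
import statistics


def mean_shift_samples(n, mu, eps, shifts, noise, rng):
    """Return n samples, roughly eps*n of which have a shifted mean.

    noise(rng) draws one zero-mean sample from the base distribution.
    shifts is a list of shift values that a hypothetical adversary
    cycles through (an assumption made for this sketch).
    """
    n_bad = int(eps * n)
    # Clean samples: true mean plus base noise.
    samples = [mu + noise(rng) for _ in range(n - n_bad)]
    # Contaminated samples: same noise shape, but the mean is shifted.
    samples += [mu + shifts[i % len(shifts)] + noise(rng) for i in range(n_bad)]
    rng.shuffle(samples)
    return samples


rng = random.Random(0)
gaussian_noise = lambda r: r.gauss(0.0, 1.0)  # assumed base distribution
data = mean_shift_samples(
    n=20000, mu=1.0, eps=0.1, shifts=[50.0], noise=gaussian_noise, rng=rng
)

naive = statistics.fmean(data)    # pulled toward the shifted points
median = statistics.median(data)  # a standard robust baseline, not the paper's estimator
print(f"empirical mean: {naive:.2f}, median: {median:.2f}")
```

In this setup the empirical mean lands far from the true mean of 1.0, while the median stays much closer but still carries a bias that depends on the contamination fraction. The abstract's point is that, for the Gaussian and Laplace cases, consistent estimation is possible under mean-shift contamination, unlike in Huber's model.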
Taxonomy
Topics: Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques · Gaussian Processes and Bayesian Inference
