Outlier-robust Mean Estimation near the Breakdown Point via Sum-of-Squares
Hongjie Chen, Deepak Narayanan Sridharan, David Steurer

TL;DR
This paper demonstrates that a sum-of-squares based approach can efficiently and optimally estimate the mean of high-dimensional distributions even when nearly half of the data are adversarial outliers.
Contribution
The paper provides a new analysis of the sum-of-squares program, achieving optimal error rates for robust mean estimation near the breakdown point, with a novel identifiability proof.
Findings
Achieves optimal error rate for all outlier fractions below 50%.
Provides a new identifiability proof based on distribution overlap.
Derives efficient algorithms from sum-of-squares proofs.
Abstract
We revisit the problem of estimating the mean of a high-dimensional distribution in the presence of an $\varepsilon$-fraction of adversarial outliers. When $\varepsilon$ is at most some sufficiently small constant, previous works can achieve optimal error rate efficiently \cite{diakonikolas2018robustly, kothari2018robust}. As $\varepsilon$ approaches the breakdown point $\frac{1}{2}$, all previous algorithms incur either sub-optimal error rates or exponential running time. In this paper we give a new analysis of the canonical sum-of-squares program introduced in \cite{kothari2018robust} and show that this program efficiently achieves optimal error rate for all $\varepsilon \in [0, \frac{1}{2})$. The key ingredient for our results is a new identifiability proof for robust mean estimation that focuses on the overlap between the distributions instead of their statistical distance as in…
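To make the problem setting concrete, the following is a minimal sketch of the contamination model the abstract describes. It is not the sum-of-squares program analyzed in the paper. Everything in it is an illustrative assumption: the inliers are standard Gaussian with true mean 0, the "adversary" is a simple non-adaptive one that places all outliers at a single far-away point, and the coordinate-wise median serves only as a familiar baseline estimator. The helper names (`corrupted_sample`, `coordinatewise_median`, and so on) are hypothetical.

```python
import random
import statistics


def corrupted_sample(n, d, eps, shift, seed=0):
    """Return n points in d dimensions, an eps-fraction of which are outliers.

    Assumptions for illustration only: inliers are standard Gaussian with
    true mean 0, and every outlier sits at the point (shift, ..., shift).
    """
    rng = random.Random(seed)
    n_out = int(eps * n)
    inliers = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n - n_out)]
    outliers = [[shift] * d for _ in range(n_out)]
    return inliers + outliers


def empirical_mean(points):
    """Plain average of the points, computed per coordinate."""
    d = len(points[0])
    return [statistics.fmean(p[j] for p in points) for j in range(d)]


def coordinatewise_median(points):
    """Baseline estimator: the median of each coordinate (not the paper's method)."""
    d = len(points[0])
    return [statistics.median(p[j] for p in points) for j in range(d)]


def l2_norm(v):
    """Euclidean norm; equals the estimation error here because the true mean is 0."""
    return sum(x * x for x in v) ** 0.5


# 40% outliers, i.e. an outlier fraction fairly close to the breakdown point 1/2.
points = corrupted_sample(n=2000, d=5, eps=0.4, shift=100.0)
mean_err = l2_norm(empirical_mean(points))
median_err = l2_norm(coordinatewise_median(points))
print(f"empirical mean error:         {mean_err:.2f}")
print(f"coordinate-wise median error: {median_err:.2f}")
```

In this toy setup the empirical mean is dragged toward the outliers in proportion to `shift`, while the median-based baseline is not, as long as the outlier fraction stays below one half. The sketch only illustrates why robust estimators are needed; it says nothing about the optimal error rates the paper establishes.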
Taxonomy
Topics: Fault Detection and Control Systems · Advanced Statistical Process Monitoring · Probabilistic and Robust Engineering Design
