List Decodable Mean Estimation in Nearly Linear Time

Yeshwanth Cherapanamjeri; Sidhanth Mohanty; Morris Yau

arXiv:2005.09796 · cs.DS · January 22, 2021

TL;DR

This paper introduces a nearly linear time algorithm for list decodable mean estimation in high-dimensional data with many outliers, achieving near-optimal recovery and sample complexity.

Contribution

It develops a descent-style algorithm on a nonconvex landscape for list decodable mean estimation, with custom SDP solvers for saddle-point optimization.

Findings

01

Achieves near-optimal recovery of the mean with high probability.

02

Provides an algorithm that runs in nearly linear time.

03

Introduces custom primal-dual SDP solvers for the saddle-point optimization inside the descent-style algorithm.

Abstract

Learning from data in the presence of outliers is a fundamental problem in statistics. Until recently, no computationally efficient algorithms were known to compute the mean of a high-dimensional distribution under natural assumptions in the presence of even a small fraction of outliers. In this paper, we consider robust statistics in the presence of overwhelming outliers, where the majority of the dataset is introduced adversarially. With only an $\alpha < 1/2$ fraction of "inliers" (clean data), the mean of a distribution is unidentifiable. However, in their influential work, [CSV17] introduce a polynomial-time algorithm that recovers the mean of distributions with bounded covariance by outputting a succinct list of $O(1/\alpha)$ candidate solutions, one of which is guaranteed to be close to the true distributional mean; a direct analog of 'List Decoding' in the theory of error correcting…
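The unidentifiability argument in the abstract can be made concrete with a toy experiment. The sketch below is an illustration only and is not the algorithm from the paper: it assumes an adversary that plants $1/\alpha - 1$ fake clusters that look exactly like the inliers, so nothing in the data says which cluster is the clean one. The helper names `make_dataset`, `naive_candidates`, and `distance` are hypothetical, and the "list decoder" is a deliberately naive stand-in that only works because the toy clusters are well separated along the first coordinate.

```python
import math
import random


def make_dataset(n, alpha, dim, seed=0):
    """Build k = 1/alpha equal-size clusters with unit-variance noise.

    Cluster 0 plays the role of the inliers; the remaining clusters are
    planted by a hypothetical adversary and are statistically identical.
    """
    rng = random.Random(seed)
    k = round(1 / alpha)
    size = n // k
    # Assumption for the toy: centers are spaced 10 apart on coordinate 0.
    centers = [[10.0 * c] + [0.0] * (dim - 1) for c in range(k)]
    points = [
        [cj + rng.gauss(0.0, 1.0) for cj in center]
        for center in centers
        for _ in range(size)
    ]
    rng.shuffle(points)  # the learner never sees the cluster labels
    return points, centers


def naive_candidates(points, k):
    """Naive stand-in decoder: sort by coordinate 0, cut into k blocks,
    and return the mean of each block as one candidate."""
    ordered = sorted(points, key=lambda p: p[0])
    size = len(ordered) // k
    dim = len(ordered[0])
    groups = [ordered[i * size:(i + 1) * size] for i in range(k)]
    return [[sum(p[j] for p in g) / len(g) for j in range(dim)] for g in groups]


def distance(u, v):
    """Euclidean distance between two points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


alpha = 0.25
points, centers = make_dataset(n=400, alpha=alpha, dim=5)
candidates = naive_candidates(points, k=round(1 / alpha))

# One candidate lands near the true inlier mean (centers[0]), but the data
# alone cannot reveal which one, so a list of about 1/alpha is needed.
best_error = min(distance(c, centers[0]) for c in candidates)
print(len(candidates), best_error)
```

Every cluster here is an equally valid explanation of the data, which is why a single estimate cannot succeed and the guarantee is stated for a list of $O(1/\alpha)$ candidates instead.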
