Being Robust (in High Dimensions) Can Be Practical

Ilias Diakonikolas; Gautam Kamath; Daniel M. Kane; Jerry Li; Ankur Moitra; Alistair Stewart

arXiv:1703.00893 · cs.LG · March 14, 2018 · 72 cites

TL;DR

This paper demonstrates that high-dimensional robust estimation can be practical: it establishes sample complexity bounds that are optimal up to logarithmic factors, refines existing algorithms so that they tolerate a larger fraction of corruptions, and reports empirical results showing state-of-the-art performance.

Contribution

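The paper refines polynomial-time robust estimators of the mean and covariance so that they tolerate a much larger fraction of corruptions, and it establishes sample complexity bounds that are optimal up to logarithmic factors, making high-dimensional robust estimation feasible in practice.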

Findings

1. The algorithms achieve state-of-the-art performance on synthetic data.
2. The sample complexity bounds are optimal up to logarithmic factors.
3. The refined algorithms tolerate a much larger fraction of corruptions.

Abstract

Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial time algorithms that can tolerate a constant fraction of corruptions, independent of the dimension. However, the sample and time complexity of these algorithms is prohibitively large for high-dimensional applications. In this work, we address both of these issues by establishing sample complexity bounds that are optimal, up to logarithmic factors, as well as giving various refinements that allow the algorithms to tolerate a much larger fraction of corruptions. Finally, we show on both synthetic and…
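To make the setting in the abstract concrete, here is a minimal Python sketch of a spectral filtering heuristic for robust mean estimation. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the uncorrupted points have roughly identity covariance, and the function name `filter_mean`, the threshold constant `c`, and the rule of discarding a fixed fraction of points per round are all hypothetical choices made for this example.

```python
import numpy as np


def filter_mean(X, eps, c=3.0, max_iter=50):
    """Illustrative spectral filter for robust mean estimation.

    Assumes inliers have (approximately) identity covariance. The constant
    `c` and the fixed-fraction removal rule are assumptions of this sketch.
    """
    X = np.asarray(X, dtype=float)
    # Assumed stopping rule: a clean identity-covariance sample should have
    # a top covariance eigenvalue close to 1.
    threshold = 1.0 + c * eps * np.log(1.0 / eps)
    for _ in range(max_iter):
        n = len(X)
        centered = X - X.mean(axis=0)
        cov = centered.T @ centered / n
        eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
        if eigvals[-1] <= threshold:
            break  # no direction has suspiciously large variance
        # Score each point by its squared projection onto the top eigenvector.
        scores = (centered @ eigvecs[:, -1]) ** 2
        # Discard the highest-scoring points, always keeping at least one.
        k = min(max(1, int(eps / 2 * n)), n - 1)
        X = X[np.argsort(scores)[: n - k]]
    return X.mean(axis=0)


# Synthetic demo: inliers from N(0, I), plus an eps fraction of shifted points.
rng = np.random.default_rng(0)
d, n, eps = 50, 2000, 0.1
n_bad = int(eps * n)
inliers = rng.standard_normal((n - n_bad, d))     # true mean is the zero vector
outliers = rng.standard_normal((n_bad, d)) + 3.0  # corrupted cluster
X = np.vstack([inliers, outliers])

print("naive mean error: ", np.linalg.norm(X.mean(axis=0)))
print("filter mean error:", np.linalg.norm(filter_mean(X, eps)))
```

The idea behind the sketch is that corruptions which move the sample mean noticeably must also inflate the variance along some direction, so the top eigenvector of the sample covariance points toward them. The demo uses an easy, well-separated outlier cluster; harder corruptions may need a more careful removal rule than the fixed-fraction one assumed here.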

Taxonomy

Topics: Statistical Methods and Inference · Advanced Statistical Methods and Models · Machine Learning and Algorithms