Algorithms and Hardness for Robust Subspace Recovery

Moritz Hardt; Ankur Moitra

arXiv:1211.1041 · cs.CC · December 5, 2013 · 32 citations

TL;DR

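This paper gives an efficient algorithm for robust subspace recovery that finds a $d$-dimensional subspace of $\mathbb{R}^n$ whenever it contains more than a $d/n$ fraction of the points, and proves that tolerating any larger fraction of outliers is Small Set Expansion hard, which suggests the threshold is the right trade-off between efficiency and robustness.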

Contribution

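The paper presents an algorithm for subspace recovery that is both robust to adversarial outliers and computationally efficient. The matching hardness result, which relies on the assumed hardness of Small Set Expansion, gives evidence that the $d/n$ inlier threshold cannot be improved by any efficient algorithm.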

Findings

01

The algorithm recovers the subspace $T$ whenever $T$ contains more than a $d/n$ fraction of the points.

02

Proves that recovering $T$ is Small Set Expansion hard when the fraction of outliers is any larger.

03

Establishes a fundamental efficiency-robustness trade-off in subspace recovery.

Abstract

We consider a fundamental problem in unsupervised learning called subspace recovery: given a collection of $m$ points in $\mathbb{R}^n$, if many but not necessarily all of these points are contained in a $d$-dimensional subspace $T$, can we find it? The points contained in $T$ are called inliers and the remaining points are outliers. This problem has received considerable attention in computer science and in statistics. Yet efficient algorithms from computer science are not robust to adversarial outliers, and the estimators from robust statistics are hard to compute in high dimensions. Are there algorithms for subspace recovery that are both robust to outliers and efficient? We give an algorithm that finds $T$ when it contains more than a $\frac{d}{n}$ fraction of the points. Hence, for say $d = n/2$, this estimator is both easy to compute and well-behaved when…
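To build intuition for why a $d/n$ inlier fraction is a natural threshold, here is a minimal sketch. It is an illustration under stated assumptions and is not presented as the paper's algorithm. It relies only on a linear-algebra fact: any $d+1$ points in a $d$-dimensional subspace are linearly dependent, so a sample of $n$ points containing more than $d$ inliers cannot have full rank. Assuming the outliers are in general position (here, Gaussian), the coefficients of such a dependency vanish on the outliers. The helper names `make_instance` and `dependent_sample_support`, the synthetic data, and the tolerances are all hypothetical choices made for this example.

```python
import numpy as np


def make_instance(m, n, d, num_inliers, rng):
    """Hypothetical generator: inliers in a random d-dim subspace T of R^n,
    plus generic (Gaussian) outliers. Inliers occupy rows 0..num_inliers-1."""
    basis = rng.standard_normal((d, n))                      # rows span T
    inliers = rng.standard_normal((num_inliers, d)) @ basis  # points inside T
    outliers = rng.standard_normal((m - num_inliers, n))     # generic points
    return np.vstack([inliers, outliers])


def dependent_sample_support(points, rng, tol=1e-9):
    """Sample n of the m points without replacement. If the sample is linearly
    dependent, return the indices of the points that carry a nonzero
    coefficient in one dependency; otherwise return None."""
    m, n = points.shape
    idx = rng.choice(m, size=n, replace=False)
    sample = points[idx]                                     # n x n, rows are points
    # A dependency is a vector c with sum_i c[i] * sample[i] = 0,
    # i.e. sample.T @ c = 0, so c is a null vector of sample.T.
    _, sing, vh = np.linalg.svd(sample.T)
    if sing[-1] > tol * sing[0]:
        return None                                          # sample has full rank
    coeffs = vh[-1]                                          # null vector of sample.T
    return idx[np.abs(coeffs) > 1e-6]                        # assumed support cutoff
```

For example, with `n = 6`, `d = 3`, and 24 inliers among `m = 40` points (an inlier fraction of 0.6, above $d/n = 0.5$), calling `dependent_sample_support` repeatedly should soon return a set of indices that all point to inliers. When the inlier fraction is at most $d/n$, a random sample of $n$ points is less likely to contain more than $d$ inliers, which is the intuition this sketch is meant to convey.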

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms · Integrated Circuits and Semiconductor Failure Analysis