Subspace approximation with outliers

Amit Deshpande; Rameshwar Pratap

arXiv:2006.16573 · cs.CG · July 1, 2020

TL;DR

This paper introduces an efficient algorithm for subspace approximation with outliers, achieving near-optimal solutions under certain error assumptions, even with a high fraction of outliers, by extending dimension reduction and sampling techniques.

Contribution

It extends dimension reduction and sampling methods to subspace approximation with outliers, and gets around the Small Set Expansion (SSE) hardness of robust subspace recovery by assuming a condition on the error of the optimal solution.

Findings

01. Provides a polynomial-time algorithm with linear dependence on n and d.

02. Achieves a (1+ε)-approximation for the optimal subspace.

03. Works even with large outlier fractions under certain error assumptions.

Abstract

The subspace approximation problem with outliers, for given $n$ points in $d$ dimensions $x_{1}, \dots, x_{n} \in \mathbb{R}^{d}$, an integer $1 \leq k \leq d$, and an outlier parameter $0 \leq \alpha \leq 1$, is to find a $k$-dimensional linear subspace of $\mathbb{R}^{d}$ that minimizes the sum of squared distances to its nearest $(1 - \alpha) n$ points. More generally, the $\ell_{p}$ subspace approximation problem with outliers minimizes the sum of $p$-th powers of distances instead of the sum of squared distances. Even the case of robust PCA is non-trivial, and previous work requires additional assumptions on the input. Any multiplicative approximation algorithm for the subspace approximation problem with outliers must solve the robust subspace recovery problem, a special case in which the $(1 - \alpha) n$ inliers in the optimal solution are promised to lie exactly on a $k$-dimensional linear subspace.…
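To make the objective in the abstract concrete, here is a minimal sketch that evaluates it for one candidate subspace. It only computes the cost; it is not the paper's algorithm and does not search for the best subspace. The function name `outlier_subspace_cost`, the use of NumPy, and the rounding of $(1 - \alpha) n$ to the nearest integer are assumptions made for illustration.

```python
import numpy as np


def outlier_subspace_cost(X, V, alpha, p=2):
    """Sum of p-th powers of distances from the nearest (1 - alpha) * n
    points to the linear subspace spanned by the columns of V.

    X     : (n, d) array, one point per row.
    V     : (d, k) array whose columns span the candidate subspace.
    alpha : outlier fraction in [0, 1].
    p     : exponent; p = 2 gives the sum of squared distances.

    Hypothetical helper: rounding (1 - alpha) * n to an integer is an
    assumption, since the abstract does not say how to round.
    """
    X = np.asarray(X, dtype=float)
    V = np.asarray(V, dtype=float)
    n = X.shape[0]

    # Orthonormalize the basis so that Q @ Q.T is the orthogonal projector.
    Q, _ = np.linalg.qr(V)

    # Distance of each point to the subspace = norm of its residual.
    residual = X - (X @ Q) @ Q.T
    dist = np.linalg.norm(residual, axis=1)

    # Keep only the (1 - alpha) * n points nearest to the subspace.
    m = int(round((1 - alpha) * n))
    return float(np.sort(dist ** p)[:m].sum())


# Three points on the x-axis plus one far-away point in the plane.
X = np.array([[1.0, 0.0], [2.0, 0.0], [-3.0, 0.0], [0.0, 5.0]])
V = np.array([[1.0], [0.0]])  # candidate 1-dimensional subspace: the x-axis

cost_with_outlier_budget = outlier_subspace_cost(X, V, alpha=0.25)
cost_without_budget = outlier_subspace_cost(X, V, alpha=0.0)
print(cost_with_outlier_budget, cost_without_budget)
```

In this toy input the three inliers lie exactly on the candidate subspace, which is the robust subspace recovery situation described in the abstract: the cost is zero once the single outlier may be discarded, and positive when every point must be counted.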

Taxonomy

Methods: Principal Components Analysis