# PCA in Data-Dependent Noise (Correlated-PCA): Nearly Optimal Finite Sample Guarantees

**Authors:** Namrata Vaswani, Praneeth Narayanamurthy

arXiv: 1702.03070 · 2017-11-01

## TL;DR

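This paper analyzes SVD-based PCA when part of the corrupting noise is data-dependent, so that the noise and the true data are correlated. It gives a nearly optimal finite-sample guarantee under a weaker data-noise correlation assumption than the earlier result of Vaswani and Guo (NIPS 2016).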

## Contribution

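The paper derives a nearly optimal sample complexity bound for the SVD solution to the correlated-PCA problem. The bound requires boundedness of the true data and the noise, plus a simple data-noise correlation assumption that is significantly weaker than the one used by Vaswani and Guo (NIPS 2016).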

## Key findings

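- The sample complexity bound for correlated-PCA significantly improves on the bound of Vaswani and Guo (NIPS 2016).
- The result holds under a significantly weaker data-noise correlation assumption than that earlier work.
- The guarantee is nearly optimal and applies to the most commonly used PCA solution, the SVD.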

## Abstract

We study Principal Component Analysis (PCA) in a setting where a part of the corrupting noise is data-dependent and, as a result, the noise and the true data are correlated. Under a bounded-ness assumption on the true data and the noise, and a simple assumption on data-noise correlation, we obtain a nearly optimal sample complexity bound for the most commonly used PCA solution, singular value decomposition (SVD). This bound is a significant improvement over the bound obtained by Vaswani and Guo in recent work (NIPS 2016) where this "correlated-PCA" problem was first studied; and it holds under a significantly weaker data-noise correlation assumption than the one used for this earlier result.
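The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's experiment: it assumes a linear data-dependent noise model `w_t = M_t l_t` with a sparse, slowly rotating support, bounded uniform coefficients for the true data, and a noise-to-data ratio `q`. The function name `subspace_error_after_svd` and all its parameters are hypothetical choices made for illustration only.

```python
import numpy as np


def subspace_error_after_svd(n=50, r=3, alpha=2000, q=0.05, s=5, beta=20, seed=0):
    """Simulate y_t = l_t + w_t with data-dependent w_t and run SVD-based PCA.

    All modelling choices here are illustrative assumptions, not taken
    from the paper: the linear noise model, the rotating support, and
    the uniform distribution of the subspace coefficients.
    """
    rng = np.random.default_rng(seed)

    # True r-dimensional subspace, represented by an orthonormal basis P (n x r).
    P, _ = np.linalg.qr(rng.standard_normal((n, r)))

    # Bounded true data: l_t = P a_t with a_t uniform in [-1, 1]^r.
    A = rng.uniform(-1.0, 1.0, size=(r, alpha))
    L = P @ A

    # Fixed mixing matrix B, scaled to spectral norm 1.
    B = rng.standard_normal((n, n))
    B /= np.linalg.norm(B, 2)

    # Support of the noise: s consecutive coordinates (mod n) that shift
    # every beta samples, so the data-noise correlation changes over time.
    rows = np.arange(n)[:, None]
    starts = ((np.arange(alpha) // beta) * s) % n
    mask = ((rows - starts[None, :]) % n) < s

    # Data-dependent noise w_t = M_t l_t with ||M_t||_2 <= q.
    W = q * mask * (B @ L)

    # Observed data and SVD-based PCA: top r left singular vectors.
    Y = L + W
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    P_hat = U[:, :r]

    # Subspace recovery error: ||(I - P_hat P_hat^T) P||_2.
    return np.linalg.norm(P - P_hat @ (P_hat.T @ P), 2)


if __name__ == "__main__":
    for alpha in (200, 2000, 20000):
        err = subspace_error_after_svd(alpha=alpha)
        print(f"alpha={alpha:6d}  subspace error={err:.4f}")
```

The returned value is the sine of the largest principal angle between the true and estimated subspaces, so it lies in [0, 1]. Varying `alpha`, `q`, or the support schedule (`s`, `beta`) gives a rough empirical feel for how sample size and data-noise correlation affect recovery under these assumed conditions.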

---
Source: https://tomesphere.com/paper/1702.03070