Cauchy robust principal component analysis with applications to high-dimensional data sets
Ayisha Fayomi, Yannis Pantazis, Michail Tsagris, and Andrew T.A. Wood

TL;DR
This paper introduces a robust PCA method using a multivariate Cauchy likelihood, improving robustness to outliers in high-dimensional data sets, with an algorithm and theoretical analysis demonstrating its effectiveness.
Contribution
It proposes a robust PCA formulation that replaces the Gaussian likelihood with a multivariate Cauchy likelihood, gives an algorithm to compute the resulting principal components, and derives the influence function of the first component.
Findings
Outperforms existing robust PCA methods in simulations
Provides a new influence function analysis for robust PCA
Demonstrates effectiveness on high-dimensional datasets
Abstract
Principal component analysis (PCA) is a standard dimensionality reduction technique used in various research and applied fields. From an algorithmic point of view, classical PCA can be formulated in terms of operations on a multivariate Gaussian likelihood. As a consequence of the implied Gaussian formulation, the principal components are not robust to outliers. In this paper, we propose a modified formulation, based on the use of a multivariate Cauchy likelihood instead of the Gaussian likelihood, which has the effect of robustifying the principal components. We present an algorithm to compute these robustified principal components. We additionally derive the relevant influence function of the first component and examine its theoretical properties. Simulation experiments on high-dimensional datasets demonstrate that the estimated principal components based on the Cauchy likelihood…
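The likelihood view in the abstract can be illustrated with a small sketch. For a unit direction u, the Gaussian log-likelihood of the projections, maximised over mean and variance, is a decreasing function of the projected variance, so the direction that minimises this profile log-likelihood is the classical first principal component. The sketch below swaps in a univariate Cauchy log-likelihood for the projections and searches directions on a 2-D grid. This is an assumed reading of the abstract, not the authors' algorithm: the grid search, the EM-style fixed point used for the Cauchy fit, the toy data, and all function names are illustrative choices.

```python
import numpy as np

def cauchy_mle(z, n_iter=200):
    """Location/scale MLE of a univariate Cauchy via the EM fixed point
    for a Student-t with one degree of freedom (illustrative helper)."""
    mu = np.median(z)
    gamma = max(np.median(np.abs(z - mu)), 1e-8)
    for _ in range(n_iter):
        w = 2.0 / (1.0 + ((z - mu) / gamma) ** 2)   # EM weights for nu = 1
        mu = np.sum(w * z) / np.sum(w)
        gamma = max(np.sqrt(np.mean(w * (z - mu) ** 2)), 1e-8)
    return mu, gamma

def cauchy_profile_loglik(z):
    """Cauchy log-likelihood of z, maximised over location and scale."""
    mu, gamma = cauchy_mle(z)
    d2 = ((z - mu) / gamma) ** 2
    return -np.sum(np.log(np.pi * gamma) + np.log1p(d2))

def gaussian_profile_loglik(z):
    """Gaussian log-likelihood of z, maximised over mean and variance."""
    return -0.5 * z.size * (np.log(2.0 * np.pi * np.var(z)) + 1.0)

def first_direction(X, profile_loglik, n_angles=180):
    """Angle (radians, in [0, pi)) of the 2-D unit direction that
    minimises the profile log-likelihood of the projected data."""
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    values = [profile_loglik(X @ np.array([np.cos(a), np.sin(a)]))
              for a in angles]
    return angles[int(np.argmin(values))]

def deviation_from_x_axis(angle):
    """Smallest angle between the direction and the x-axis."""
    return min(angle, np.pi - angle)

# Toy data (assumption): most variance lies along x; a few gross
# outliers sit far away along y.
rng = np.random.default_rng(0)
inliers = rng.normal(size=(200, 2)) * np.array([3.0, 0.5])
outliers = np.column_stack([rng.normal(size=5), 40.0 + rng.normal(size=5)])
X = np.vstack([inliers, outliers])

gauss_angle = first_direction(X, gaussian_profile_loglik)
cauchy_angle = first_direction(X, cauchy_profile_loglik)
print("Gaussian direction, degrees from x-axis:",
      np.degrees(deviation_from_x_axis(gauss_angle)))
print("Cauchy direction, degrees from x-axis:",
      np.degrees(deviation_from_x_axis(cauchy_angle)))
```

On this toy data the Gaussian criterion is dominated by the handful of outliers and points close to the y-axis, while the Cauchy criterion should stay noticeably closer to the x-axis, where the bulk of the data varies. A grid over angles only works in two dimensions; the high-dimensional setting the paper targets would need a proper optimiser over unit vectors.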
Taxonomy
Topics: Spectroscopy and Chemometric Analyses · Advanced Statistical Methods and Models · Remote-Sensing Image Classification
