Heavy-Tailed Principal Component Analysis
Mario Sayde, Christopher Khater, Jihad Fahs, Ibrahim Abou-Faycal

TL;DR
This paper introduces a robust PCA framework for heavy-tailed data based on a logarithmic loss. It shows that the resulting principal components align with Gaussian-based PCA and that the approach outperforms classical methods in noisy, heavy-tailed scenarios.
Contribution
It develops a unified theoretical approach for PCA under heavy-tailed distributions using a logarithmic loss and proposes robust covariance estimators tailored for such data.
Findings
Proposed estimators outperform classical PCA in heavy-tailed noise.
Under the logarithmic loss, the principal components of heavy-tailed data match those of Gaussian-based PCA.
Method remains effective even with impulsive noise and infinite variance.
Abstract
Principal Component Analysis (PCA) is a cornerstone of dimensionality reduction, yet its classical formulation relies critically on second-order moments and is therefore fragile in the presence of heavy-tailed data and impulsive noise. While numerous robust PCA variants have been proposed, most either assume finite variance, rely on sparsity-driven decompositions, or address robustness through surrogate loss functions without a unified treatment of infinite-variance models. In this paper, we study PCA for high-dimensional data generated according to a superstatistical dependent model of the form $\mathbf{X} = A^{1/2}\mathbf{G}$, where $A$ is a positive random scalar and $\mathbf{G}$ is a Gaussian vector. This framework captures a wide class of heavy-tailed distributions, including multivariate $t$ and sub-Gaussian $\alpha$-stable laws. We formulate PCA under a logarithmic loss, which…
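To make the data model in the abstract concrete, the following is a minimal sketch rather than the authors' method. It assumes the scale-mixture form X = sqrt(A) G, picks A = nu / chi2(nu) with nu = 1 (a multivariate Cauchy, so the variance is infinite), and uses an assumed covariance I + 9 u u^T for G. The robust estimator shown is the standard spatial sign covariance, swapped in as a stand-in because the paper's own estimators are not described in this summary. The helper names `sample_scale_mixture`, `sign_scatter`, and `top_direction` are hypothetical.

```python
import numpy as np

def sample_scale_mixture(n, u, nu, rng):
    """Draw n samples X = sqrt(A) * G (illustrative helper, not from the paper).

    G ~ N(0, I + 9 u u^T) for a unit vector u (assumed covariance).
    A = nu / chi2(nu) yields a multivariate t with nu degrees of freedom.
    """
    d = u.size
    # Isotropic noise plus a rank-one component along u with std 3.
    g = rng.standard_normal((n, d)) + 3.0 * rng.standard_normal((n, 1)) * u
    a = nu / rng.chisquare(nu, size=(n, 1))
    return np.sqrt(a) * g

def top_direction(scatter):
    # eigh returns eigenvalues in ascending order, so the last column
    # is the leading eigenvector.
    _, vecs = np.linalg.eigh(scatter)
    return vecs[:, -1]

def sign_scatter(x):
    # Spatial sign covariance: projecting each sample onto the unit sphere
    # cancels the random scalar sqrt(A) exactly.
    s = x / np.linalg.norm(x, axis=1, keepdims=True)
    return s.T @ s / len(s)

rng = np.random.default_rng(0)
u = np.ones(5) / np.sqrt(5.0)          # true leading direction of G
x = sample_scale_mixture(5000, u, nu=1.0, rng=rng)

classical = top_direction(x.T @ x / len(x))   # sample second-moment matrix
robust = top_direction(sign_scatter(x))

# |cosine| between each estimated direction and the true one (1.0 is perfect).
align_classical = abs(classical @ u)
align_robust = abs(robust @ u)
print("classical:", align_classical, "sign-based:", align_robust)
```

Because each normalized sample equals G / ||G||, the sign-based scatter depends only on the Gaussian part of the model, which is one simple way to see why the principal directions of the heavy-tailed data can coincide with those of the underlying Gaussian vector. The classical estimate, by contrast, is driven by a handful of extreme samples and can change noticeably from one seed to the next.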
