Robust self-tuning semiparametric PCA for contaminated elliptical distribution
Hung Hung, Su-Yun Huang, and Shinto Eguchi

TL;DR
This paper introduces a robust, self-tuning semiparametric PCA method that effectively handles contaminated elliptical distributions and outliers, outperforming traditional PCA and Tyler's M-estimator.
Contribution
It proposes a novel semiparametric PCA approach that is robust to both heavy-tailed and non-elliptical outliers, with a data-driven tuning procedure for adaptability.
Findings
The method is robust to heavy-tailed elliptical distributions.
It adapts to different degrees of data outlyingness via a data-driven tuning procedure based on the active ratio.
Simulations and data analysis show it outperforming traditional PCA and Tyler's M-estimator.
Abstract
Principal component analysis (PCA) is one of the most popular dimension reduction methods. The usual PCA is known to be sensitive to the presence of outliers, and thus many robust PCA methods have been developed. Among them, Tyler's M-estimator is shown to be the most robust scatter estimator under the elliptical distribution. However, when the underlying distribution is contaminated and deviates from ellipticity, Tyler's M-estimator might not work well. In this article, we apply the semiparametric theory to propose a robust semiparametric PCA. The merits of our proposal are twofold. First, it is robust to heavy-tailed elliptical distributions as well as to non-elliptical outliers. Second, it pairs well with a data-driven tuning procedure, which is based on active ratio and can adapt to different degrees of data outlyingness. Theoretical properties are derived, including the…
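For context on the baseline the abstract compares against, here is a minimal sketch of Tyler's M-estimator of scatter, computed by the standard fixed-point iteration and followed by an eigendecomposition to obtain principal directions. This is not the paper's semiparametric method or its active-ratio tuning, neither of which is specified in the text above. The function names `tyler_scatter` and `leading_directions`, the trace normalization, the stopping rule, and the simulated data are all assumptions made for illustration, and the data are assumed to be centered already.

```python
import numpy as np

def tyler_scatter(X, n_iter=200, tol=1e-8):
    """Tyler's M-estimator of scatter via fixed-point iteration.

    Assumes the rows of X are already centered and that no row is exactly zero.
    The estimate is only defined up to scale, so it is normalized to trace = p
    (a common convention, chosen here for illustration).
    """
    n, p = X.shape
    sigma = np.eye(p)
    for _ in range(n_iter):
        # d_i = x_i^T sigma^{-1} x_i for every row x_i
        d = np.einsum("ij,ij->i", X @ np.linalg.inv(sigma), X)
        # Weighted second-moment matrix: (p/n) * sum_i x_i x_i^T / d_i
        new = (p / n) * (X / d[:, None]).T @ X
        new *= p / np.trace(new)
        converged = np.linalg.norm(new - sigma, ord="fro") < tol
        sigma = new
        if converged:
            break
    return sigma

def leading_directions(scatter, k):
    """Return the k eigenvectors of a symmetric scatter matrix with largest eigenvalues."""
    vals, vecs = np.linalg.eigh(scatter)
    order = np.argsort(vals)[::-1][:k]
    return vecs[:, order]

# Hypothetical heavy-tailed elliptical sample: Gaussian directions with most
# variance along the first axis, multiplied by a Cauchy-distributed radius.
rng = np.random.default_rng(0)
Z = rng.standard_normal((500, 3)) * np.array([3.0, 1.0, 1.0])
X = Z * np.abs(rng.standard_cauchy(500))[:, None]

S = tyler_scatter(X)
V = leading_directions(S, k=1)
```

Because each observation is reweighted by its own distance `d`, the estimate depends only on the directions of the observations, which is why it tolerates heavy tails within the elliptical family. The abstract's point is that this protection can fail once the contamination is itself non-elliptical.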
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference
