Self-Paced Probabilistic Principal Component Analysis for Data with Outliers

Bowen Zhao; Xi Xiao; Wanpeng Zhang; Bin Zhang; Shutao Xia

arXiv:1904.06546 · cs.LG · April 16, 2019 · 1 citation

TL;DR

This paper introduces SP-PPCA, a robust variant of probabilistic PCA that incorporates self-paced learning to effectively identify and mitigate the influence of outliers in data analysis.

Contribution

The paper proposes a novel self-paced learning approach integrated into PPCA, enhancing robustness against outliers with an efficient optimization algorithm.

Findings

01

SP-PPCA effectively reduces outlier impact in synthetic and real datasets.

02

The method outperforms standard PPCA in robustness and accuracy.

03

Experimental results validate the effectiveness of the proposed approach.

Abstract

Principal Component Analysis (PCA) is a popular tool for dimensionality reduction and feature extraction in data analysis. There is a probabilistic version of PCA, known as Probabilistic PCA (PPCA). However, standard PCA and PPCA are not robust, as they are sensitive to outliers. To alleviate this problem, this paper introduces the Self-Paced Learning mechanism into PPCA, and proposes a novel method called Self-Paced Probabilistic Principal Component Analysis (SP-PPCA). Furthermore, we design the corresponding optimization algorithm based on the alternative search strategy and the expectation-maximization algorithm. SP-PPCA looks for optimal projection vectors and filters out outliers iteratively. Experiments on both synthetic problems and real-world datasets clearly demonstrate that SP-PPCA is able to reduce or eliminate the impact of outliers.
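
The alternating scheme described in the abstract (fit the model on the currently trusted samples, then re-select the samples the model explains well) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: hard 0/1 self-paced weights, an age parameter set from a growing quantile of the per-sample loss and capped below 100%, a median-based initial selection, and the closed-form PPCA maximum-likelihood solution used in place of EM updates. The function names `ppca_loglik`, `weighted_ppca`, and `sp_ppca_sketch` are hypothetical, and this is not the authors' algorithm.

```python
import numpy as np

def ppca_loglik(X, mu, W, sigma2):
    """Per-sample log-likelihood under PPCA: x ~ N(mu, W W^T + sigma2 * I)."""
    d = X.shape[1]
    C = W @ W.T + sigma2 * np.eye(d)
    Xc = X - mu
    _, logdet = np.linalg.slogdet(C)
    maha = np.einsum("ij,ij->i", Xc @ np.linalg.inv(C), Xc)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def weighted_ppca(X, v, q):
    """Closed-form PPCA maximum-likelihood fit on samples weighted by v.
    Assumption: used here instead of EM to keep the sketch short."""
    mu = (v[:, None] * X).sum(axis=0) / v.sum()
    Xc = X - mu
    S = (v[:, None] * Xc).T @ Xc / v.sum()        # weighted covariance
    evals, evecs = np.linalg.eigh(S)              # ascending eigenvalues
    evals, evecs = evals[::-1], evecs[:, ::-1]    # reorder to descending
    sigma2 = max(evals[q:].mean(), 1e-8)          # mean of discarded eigenvalues
    W = evecs[:, :q] * np.sqrt(np.maximum(evals[:q] - sigma2, 0.0))
    return mu, W, sigma2

def sp_ppca_sketch(X, q, keep_start=0.5, keep_step=0.1, keep_max=0.9, n_rounds=10):
    """Alternate between fitting PPCA and re-selecting low-loss samples.
    Assumption: the age parameter is a loss quantile that grows each round."""
    # Initial selection: samples closest to the coordinate-wise median.
    dist = np.linalg.norm(X - np.median(X, axis=0), axis=1)
    v = (dist <= np.quantile(dist, keep_start)).astype(float)
    keep = keep_start
    for _ in range(n_rounds):
        mu, W, sigma2 = weighted_ppca(X, v, q)    # model step, v fixed
        loss = -ppca_loglik(X, mu, W, sigma2)     # per-sample loss
        keep = min(keep + keep_step, keep_max)    # admit more samples over time
        lam = np.quantile(loss, keep)             # age parameter for this round
        v = (loss <= lam).astype(float)           # selection step, model fixed
    return mu, W, sigma2, v

# Synthetic check: 200 inliers along the direction (1, 1), 20 far outliers.
rng = np.random.default_rng(0)
t = 3.0 * rng.normal(size=(200, 1))
inliers = t @ (np.array([[1.0, 1.0]]) / np.sqrt(2)) + 0.1 * rng.normal(size=(200, 2))
outliers = 0.5 * rng.normal(size=(20, 2)) + np.array([10.0, -10.0])
X = np.vstack([inliers, outliers])

mu, W, sigma2, v = sp_ppca_sketch(X, q=1)
_, W_plain, _ = weighted_ppca(X, np.ones(len(X)), q=1)   # no sample selection

true_dir = np.array([1.0, 1.0]) / np.sqrt(2)
cos_sp = abs(W[:, 0] @ true_dir) / np.linalg.norm(W[:, 0])
cos_plain = abs(W_plain[:, 0] @ true_dir) / np.linalg.norm(W_plain[:, 0])
print("alignment with true direction: self-paced", cos_sp, "plain", cos_plain)
```

Capping the kept fraction at `keep_max` is a design choice of this sketch: it assumes a rough upper bound on the outlier fraction is known, so the samples with the highest loss are never admitted into the fit.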

Taxonomy

Topics: Face and Expression Recognition · Spectroscopy and Chemometric Analyses · Blind Source Separation Techniques

Methods: Principal Components Analysis