Regularised PCA to denoise and visualise data
Marie Verbanck, Julie Josse, François Husson

TL;DR
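This paper introduces a regularised PCA method that multiplies each singular value by an estimate of the ratio of signal variance to total variance for its dimension. This shrinkage improves recovery of the underlying signal and the resulting visualisations, especially when the data are noisy.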
Contribution
It proposes a regularised PCA approach, analogous to ridge regression, in which the singular values are shrunk by a term derived analytically from asymptotic results and also justified by a Bayesian treatment of the fixed effect model.
Findings
Regularised PCA outperforms classical PCA in signal recovery.
The method is especially effective with noisy data.
Graphical outputs are improved with regularisation.
Abstract
Principal component analysis (PCA) is a well-established method commonly used to explore and visualise data. A classical PCA model is the fixed effect model where data are generated as a fixed structure of low rank corrupted by noise. Under this model, PCA does not provide the best recovery of the underlying signal in terms of mean squared error. Following the same principle as in ridge regression, we propose a regularised version of PCA that boils down to thresholding the singular values. Each singular value is multiplied by a term which can be seen as the ratio of the signal variance over the total variance of the associated dimension. The regularisation term is analytically derived using asymptotic results and can also be justified from a Bayesian treatment of the model. Regularised PCA provides promising results in terms of the recovery of the true signal and the graphical outputs in…
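The shrinkage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `regularised_pca`, the use of NumPy, and in particular the noise variance estimate (the mean of the discarded eigenvalues) are assumptions made for illustration, and the paper derives its own estimator. The sketch only shows the general idea of multiplying each retained singular value by an estimated ratio of signal variance to total variance.

```python
import numpy as np


def regularised_pca(X, n_components):
    """Illustrative shrinkage of singular values (hypothetical helper).

    Returns the reconstructed matrix and the shrinkage factor applied
    to each retained dimension.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    if not 0 < n_components < min(n, p):
        raise ValueError("n_components must be in (0, min(n, p))")

    # Centre the columns, as in classical PCA.
    col_means = X.mean(axis=0)
    Xc = X - col_means

    # SVD of the centred data; d holds singular values in descending order.
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = d ** 2 / n  # total variance carried by each dimension

    # ASSUMPTION: noise variance is estimated as the mean of the
    # discarded eigenvalues. The paper uses its own estimator.
    sigma2 = eigvals[n_components:].mean()

    # Ratio of (estimated) signal variance to total variance, per dimension.
    total = eigvals[:n_components]
    signal = np.maximum(total - sigma2, 0.0)
    shrink = np.divide(signal, total, out=np.zeros_like(total), where=total > 0)

    # Shrink the retained singular values and rebuild the data matrix.
    d_shrunk = d[:n_components] * shrink
    X_hat = (U[:, :n_components] * d_shrunk) @ Vt[:n_components] + col_means
    return X_hat, shrink
```

Under these assumptions every shrinkage factor lies between 0 and 1. Dimensions whose variance is close to the estimated noise level are shrunk strongly, while dimensions dominated by signal are left almost unchanged, so the reconstruction approaches classical truncated PCA as the noise vanishes.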
Taxonomy
Topics: Gene expression and cancer classification · Neural Networks and Applications · Face and Expression Recognition
