Debiased distributed PCA under high dimensional spiked model

Weiming Li; Zeng Li; Siyu Wang; Yanqing Yin; Junpeng Zhu

arXiv:2505.22015 · stat.ME · May 29, 2025


TL;DR

This paper introduces a debiased distributed PCA method that corrects the bias of local sample eigenvectors in high-dimensional settings. The correction matters most when the number of machines is small, and the method adapts to both sparse and non-sparse eigenvectors. It is backed by theoretical guarantees and empirical validation.

Contribution

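It proposes a debiased distributed PCA algorithm that corrects local bias before aggregation and detects sparsity adaptively. Its consistency is established under weaker conditions than prior work: symmetric innovations are not required, and only a finite sixth moment is assumed.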

Findings

01

Achieves smaller estimation error than existing methods, particularly when the number of machines is small.

02

Adapts to both sparse and non-sparse eigenvectors through a sparsity-detection step.

03

Outperforms existing distributed PCA methods in simulations and real data.

Abstract

We study distributed principal component analysis (PCA) in high-dimensional settings under the spiked model. In such regimes, sample eigenvectors can deviate significantly from population ones, introducing a persistent bias. Existing distributed PCA methods are sensitive to this bias, particularly when the number of machines is small. Their consistency typically relies on the number of machines tending to infinity. We propose a debiased distributed PCA algorithm that corrects the local bias before aggregation and incorporates a sparsity-detection step to adaptively handle sparse and non-sparse eigenvectors. Theoretically, we establish the consistency of our estimator under much weaker conditions compared to existing literature. In particular, our approach does not require symmetric innovations and only assumes a finite sixth moment. Furthermore, our method generally achieves smaller…
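To make the setting in the abstract concrete, the following is a minimal sketch of a common distributed PCA baseline under a single-spike model. It is not the paper's debiased algorithm. Each machine computes its local leading eigenvector, and the center averages the local projection matrices and takes the leading eigenvector of that average. The choice of baseline, the covariance `I + spike * v v^T`, the Gaussian data, and all names and parameter values (`simulate`, `p`, `n`, `m`, `spike`) are assumptions made for illustration only.

```python
import numpy as np


def local_top_eigvecs(X, k):
    """Top-k eigenvectors of the sample covariance of X (n x p, mean zero)."""
    n = X.shape[0]
    S = X.T @ X / n
    _, vecs = np.linalg.eigh(S)      # eigenvalues come back in ascending order
    return vecs[:, ::-1][:, :k]      # reorder so the leading eigenvectors come first


def aggregate_projections(local_vecs, k):
    """Average the local projection matrices V V^T and return the top-k eigenvectors."""
    p = local_vecs[0].shape[0]
    P_bar = np.zeros((p, p))
    for V in local_vecs:
        P_bar += V @ V.T
    P_bar /= len(local_vecs)
    _, vecs = np.linalg.eigh(P_bar)
    return vecs[:, ::-1][:, :k]


def simulate(p=200, n=100, m=4, spike=5.0, seed=0):
    """Single-spike model with covariance I + spike * v v^T (hypothetical setup).

    Returns the alignment |<v_hat, v>| of each local estimate and of the
    aggregated estimate with the population eigenvector v.
    """
    rng = np.random.default_rng(seed)
    v = np.zeros(p)
    v[0] = 1.0                       # population leading eigenvector
    local_vecs = []
    for _ in range(m):
        X = rng.standard_normal((n, p))
        X[:, 0] *= np.sqrt(1.0 + spike)   # variance 1 + spike along v
        local_vecs.append(local_top_eigvecs(X, 1))
    v_hat = aggregate_projections(local_vecs, 1)[:, 0]
    local_align = [float(abs(V[:, 0] @ v)) for V in local_vecs]
    return local_align, float(abs(v_hat @ v))


if __name__ == "__main__":
    local_align, agg_align = simulate()
    print("local alignments:", np.round(local_align, 3))
    print("aggregated alignment:", round(agg_align, 3))
```

With `p` larger than `n`, the local alignments stay visibly below 1 however the data are drawn, which is the persistent bias the abstract refers to. Plain averaging does not remove that bias on each machine, and it is this local bias that the paper corrects before aggregation.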


Taxonomy

Topics: Machine Learning and ELM · Brain Tumor Detection and Classification