Scale-Calibrated Median-of-Means for Robust Distributed Principal Component Analysis

Kisung You

arXiv:2605.20681·stat.ME·May 21, 2026

TL;DR

This paper studies a scale-calibrated median-of-means estimator for robust distributed PCA. It aggregates node-level estimates of a mean vector and a principal subspace using an explicit relative scale between mean error and subspace error.

Contribution

It develops a median-of-means estimator for distributed PCA on the product of Euclidean space and the Grassmann manifold, with explicit scale calibration and asymptotic guarantees.

Findings

01

The estimator has non-Gaussian limits when the number of nodes is fixed and Gaussian limits, with finite-block bias, when the number of nodes grows.

02

Proposed calibration rules improve robustness and inference accuracy.

03

Simulations and RNA-seq data demonstrate effective adaptation to eigengap-driven uncertainty.

Abstract

Distributed principal component analysis (PCA) produces node-level estimates of both a mean vector and a principal subspace. Robustly aggregating these heterogeneous objects requires a relative scale between mean error and subspace error. We study a scale-calibrated median-of-means estimator for this problem using the product geometry of Euclidean space and the Grassmann manifold. A node-level PCA expansion shows that the mean component has the usual linear influence, whereas the subspace component is an eigengap-weighted covariance perturbation. We prove a local reduction showing that the proposed product-manifold median-of-means estimator is asymptotically equivalent to a scaled spatial median of node influence errors. This yields fixed-node non-Gaussian limits, growing-node Gaussian limits with finite-block bias, and an explicit scale-dependent covariance formula. We propose robust…
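The aggregation step described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator. It assumes a simplified stand-in: each node's subspace is embedded as its projection matrix (a chordal embedding rather than the intrinsic Grassmann geometry), stacked with the node mean after multiplying by a hand-picked `scale`, and the stacked vectors are aggregated with a Euclidean spatial median computed by Weiszfeld iterations. The function names `node_pca`, `weiszfeld`, and `mom_distributed_pca`, the `scale` parameter, and the simulated data are all hypothetical; the paper's calibration rules for choosing the scale are not reproduced here.

```python
import numpy as np

def node_pca(X, k):
    """Node-level estimate: sample mean and orthonormal basis of the top-k subspace."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    _, vecs = np.linalg.eigh(cov)      # eigenvalues are returned in ascending order
    return mu, vecs[:, -k:]            # last k columns span the top-k subspace

def weiszfeld(points, n_iter=200, tol=1e-10):
    """Spatial (geometric) median of the rows of `points` via Weiszfeld iterations."""
    z = points.mean(axis=0)
    for _ in range(n_iter):
        dist = np.linalg.norm(points - z, axis=1)
        w = 1.0 / np.maximum(dist, 1e-12)          # guard against division by zero
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def mom_distributed_pca(blocks, k, scale=1.0):
    """Aggregate node-level (mean, subspace) pairs with a scaled spatial median.

    Assumption: subspaces enter through their projection matrices U U^T, and
    `scale` sets the weight of subspace error relative to mean error.
    """
    p = blocks[0].shape[1]
    rows = []
    for X in blocks:
        mu, U = node_pca(X, k)
        rows.append(np.concatenate([mu, scale * (U @ U.T).ravel()]))
    z = weiszfeld(np.asarray(rows))
    mu_hat = z[:p]
    P_hat = z[p:].reshape(p, p) / scale
    P_hat = (P_hat + P_hat.T) / 2                  # symmetrize before eigendecomposition
    _, vecs = np.linalg.eigh(P_hat)
    return mu_hat, vecs[:, -k:]                    # map the median back to a k-dim subspace

# Simulated example: 17 clean nodes and 3 grossly corrupted nodes.
rng = np.random.default_rng(0)
p, k = 5, 2
sd = np.array([3.0, 2.0, 1.0, 1.0, 1.0])           # top-2 subspace is span(e1, e2)
blocks = [rng.normal(size=(200, p)) * sd for _ in range(17)]
blocks += [10.0 * rng.normal(size=(200, p)) + 50.0 for _ in range(3)]
mu_hat, U_hat = mom_distributed_pca(blocks, k, scale=1.0)
```

In this sketch a larger `scale` makes the median more sensitive to disagreement in the subspace component and a smaller one to disagreement in the mean component, which is the trade-off the scale calibration in the abstract is meant to resolve.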
