Sparse group principal component analysis via double thresholding with application to multi-cellular programs
Qi Xu, Jing Lei, Kathryn Roeder

TL;DR
This paper introduces SGPCA, an efficient and scalable method for estimating multi-cellular programs from high-dimensional gene expression data, with theoretical guarantees and successful application to Lupus data.
Contribution
We propose SGPCA, a novel sparse group PCA method for estimating MCPs. Its double-thresholding algorithm improves computational efficiency and statistical power, and it comes with theoretical guarantees.
Findings
SGPCA achieves linear computational complexity $O(np)$.
It demonstrates superior accuracy and power in simulations.
It successfully identifies differential MCPs in a Lupus study.
Abstract
Multi-cellular programs (MCPs) are coordinated patterns of gene expression across interacting cell types that collectively drive complex biological processes such as tissue development and immune responses. While MCPs are typically estimated from high-dimensional gene expression data using methods like sparse principal component analysis or latent factor models, these approaches often suffer from high computational costs and limited statistical power. In this work, we propose Sparse Group Principal Component Analysis (SGPCA) to estimate MCPs by leveraging their inherent group and individual sparsity. We introduce an efficient double-thresholding algorithm based on power iteration. In each iteration, a group thresholding step first identifies relevant gene groups, followed by an individual thresholding step to select active cell types. This algorithm achieves a linear computational…
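The double-thresholding power iteration described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the use of hard thresholding, the rescaling of the iterate before thresholding, the uniform starting vector, and the threshold values are all assumptions made for this example. Features are assumed to be laid out so that each gene forms one group whose entries correspond to cell types.

```python
import numpy as np


def double_threshold(w, groups, group_thr, indiv_thr):
    """Zero out weak groups, then weak entries inside the surviving groups.

    Hard thresholding is an assumption of this sketch.
    """
    out = np.zeros_like(w)
    for idx in groups:
        block = w[idx]
        # Group step: keep a gene group only if its l2 norm is large enough.
        if np.linalg.norm(block) > group_thr:
            # Individual step: keep only the large entries (cell types).
            out[idx] = np.where(np.abs(block) > indiv_thr, block, 0.0)
    return out


def sgpca_sketch(X, groups, group_thr, indiv_thr, n_iter=30):
    """Leading sparse loading vector via thresholded power iteration."""
    n, p = X.shape
    v = np.ones(p) / np.sqrt(p)  # assumed uniform starting vector
    for _ in range(n_iter):
        # Two matrix-vector products; the p x p covariance is never formed.
        w = X.T @ (X @ v) / n
        scale = np.linalg.norm(w)
        if scale == 0.0:
            break
        # Rescale so thresholds act on a unit-norm vector (an assumption).
        w = double_threshold(w / scale, groups, group_thr, indiv_thr)
        norm = np.linalg.norm(w)
        if norm == 0.0:  # everything was thresholded away; keep the last iterate
            break
        v = w / norm
    return v


# Synthetic example: 20 genes x 4 cell types, signal in 3 genes and 2 cell types.
rng = np.random.default_rng(0)
n, G, C = 500, 20, 4
groups = [np.arange(g * C, (g + 1) * C) for g in range(G)]
u = np.zeros(G * C)
for g in range(3):
    u[g * C : g * C + 2] = 1.0
u /= np.linalg.norm(u)
X = 5.0 * rng.standard_normal((n, 1)) * u + rng.standard_normal((n, G * C))

v_hat = sgpca_sketch(X, groups, group_thr=0.2, indiv_thr=0.15)
print(np.flatnonzero(v_hat))  # indices of the selected gene/cell-type entries
```

In this sketch each iteration is dominated by the two matrix-vector products `X @ v` and `X.T @ (...)`, which cost $O(np)$ together.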
Taxonomy
Topics: Single-cell and spatial transcriptomics · Gene Regulatory Network Analysis · Advanced Bandit Algorithms Research
