Inferring Covariance Structure from Multiple Data Sources via Subspace Factor Analysis
Noirrit Kiran Chandra, David B. Dunson, Jason Xu

TL;DR
This paper introduces SUFA, a Bayesian subspace factor analysis model that effectively distinguishes shared and condition-specific covariance structures across multiple high-dimensional datasets, with proven identifiability and efficient computation.
Contribution
The paper proposes SUFA models that ensure identifiability of the shared and condition-specific covariance components, and develops scalable Bayesian inference algorithms for multi-source data integration.
Findings
Successfully separates shared and condition-specific covariance structures.
Provides scalable, fully integrated Bayesian inference algorithms.
Demonstrates effectiveness on gene expression datasets in immunology.
Abstract
Factor analysis provides a canonical framework for imposing lower-dimensional structure such as sparse covariance in high-dimensional data. High-dimensional data on the same set of variables are often collected under different conditions, for instance in reproducing studies across research groups. In such cases, it is natural to seek to learn the shared versus condition-specific structure. Hierarchical extensions of factor analysis have been proposed, but they face practical issues, including identifiability problems. To address these shortcomings, we propose a class of SUbspace Factor Analysis (SUFA) models, which characterize variation across groups at the level of a lower-dimensional subspace. We prove that the proposed class of SUFA models leads to identifiability of the shared versus group-specific components of the covariance, and study their posterior contraction properties.…
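The idea of group variation living in a lower-dimensional subspace can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation. It assumes each group covariance has the form Sigma_s = Lambda Lambda^T + Lambda A_s A_s^T Lambda^T + Delta, where Lambda is a shared loading matrix, A_s is a small group-specific matrix, and Delta is diagonal noise. That form, the function name, and all parameter names are assumptions made for illustration; the abstract above does not state the model equations.

```python
import numpy as np

def simulate_group_covariances(d=6, q=2, q_s=1, n_groups=3, seed=0):
    """Build group covariances that differ only inside the span of Lambda.

    Assumed (hypothetical) form: Sigma_s = shared + specific_s + Delta.
    """
    rng = np.random.default_rng(seed)
    Lambda = rng.normal(size=(d, q))            # shared d x q loading matrix
    Delta = np.diag(rng.uniform(0.5, 1.0, d))   # diagonal idiosyncratic noise
    shared = Lambda @ Lambda.T                  # low-rank part common to all groups

    specifics, covs = [], []
    for _ in range(n_groups):
        A_s = rng.normal(size=(q, q_s))         # group-specific q x q_s matrix
        specific = Lambda @ A_s @ A_s.T @ Lambda.T  # stays in the column space of Lambda
        specifics.append(specific)
        covs.append(shared + specific + Delta)
    return Lambda, Delta, shared, specifics, covs

Lambda, Delta, shared, specifics, covs = simulate_group_covariances()
for s, (spec, cov) in enumerate(zip(specifics, covs)):
    rank = np.linalg.matrix_rank(spec, tol=1e-8)
    print(f"group {s}: specific-part rank = {rank}, cov shape = {cov.shape}")
```

Under these assumptions, every group-specific term has rank at most q_s and lies in the column space of the shared loadings, which is one way to read "variation across groups at the level of a lower-dimensional subspace".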
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Single-cell and spatial transcriptomics
