Bayesian Combinatorial Multi-Study Factor Analysis
Isabella N. Grabski, Roberta De Vito, Lorenzo Trippa, Giovanni Parmigiani

TL;DR
Tetris is a Bayesian method that extends multi-study factor analysis to identify latent factors shared by any combination of studies, improving dimension reduction and covariance estimation in high-dimensional data.
Contribution
The paper introduces Tetris, a Bayesian approach that places an Indian Buffet Process prior on which subsets of studies share each latent factor, so that factors can be common to all studies, specific to one study, or partially shared.
Findings
Successfully identifies shared and study-specific factors in simulations
Enhances covariance estimation accuracy
Reveals gene expression patterns in breast cancer datasets
Abstract
Analyzing multiple studies allows leveraging data from a range of sources and populations, but until recently, there have been limited methodologies to approach the joint unsupervised analysis of multiple high-dimensional studies. A recent method, Bayesian Multi-Study Factor Analysis (BMSFA), identifies latent factors common to all studies, as well as latent factors specific to individual studies. However, BMSFA does not allow for partially shared factors, i.e., latent factors shared by more than one but fewer than all studies. We extend BMSFA by introducing a new method, Tetris, for Bayesian combinatorial multi-study factor analysis, which identifies latent factors that can be shared by any combination of studies. We model the subsets of studies that share latent factors with an Indian Buffet Process. We test our method with an extensive range of simulations, and showcase its utility not…
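To make the idea of factors "shared by any combination of studies" concrete, the following is a minimal sketch, not the authors' implementation. It draws a binary study-by-factor sharing matrix from the standard one-parameter Indian Buffet Process and labels each factor as common, partially shared, or study-specific. The function names (`sample_ibp`, `classify_factors`, `study_covariance`) are hypothetical, and the covariance form used here (loadings of the factors a study possesses, plus a diagonal noise term) is an assumption made for illustration; the abstract does not state the exact model.

```python
import numpy as np


def sample_ibp(num_studies, alpha, rng):
    """Draw a binary (studies x factors) matrix from a one-parameter IBP.

    Studies play the role of customers and latent factors the role of dishes.
    """
    Z = np.zeros((num_studies, 0), dtype=int)
    for s in range(num_studies):
        # Join each existing factor with probability (studies already using it) / (s + 1).
        counts = Z[:s].sum(axis=0)
        Z[s, :] = rng.random(Z.shape[1]) < counts / (s + 1)
        # Introduce Poisson(alpha / (s + 1)) brand-new factors owned by study s.
        k_new = rng.poisson(alpha / (s + 1))
        new_cols = np.zeros((num_studies, k_new), dtype=int)
        new_cols[s, :] = 1
        Z = np.hstack([Z, new_cols])
    return Z


def classify_factors(Z):
    """Label each factor by how many studies share it."""
    num_studies = Z.shape[0]
    n_sharing = Z.sum(axis=0)
    return np.where(
        n_sharing == num_studies,
        "common",
        np.where(n_sharing == 1, "specific", "partial"),
    )


def study_covariance(Phi, z_s, psi_s):
    """Assumed illustrative form: Sigma_s = Phi_s Phi_s^T + diag(psi_s),
    where Phi_s keeps only the loading columns of factors study s possesses."""
    Phi_s = Phi[:, z_s.astype(bool)]
    return Phi_s @ Phi_s.T + np.diag(psi_s)


# Usage: 4 studies, 10 observed variables (all values are made up).
rng = np.random.default_rng(0)
Z = sample_ibp(num_studies=4, alpha=2.0, rng=rng)
labels = classify_factors(Z)
Phi = rng.normal(size=(10, Z.shape[1]))
Sigma_0 = study_covariance(Phi, Z[0], psi_s=np.ones(10))
print(Z)
print(labels)
```

Each row of `Z` is one study and each column one factor. A column of all ones corresponds to a common factor in the BMSFA sense, a column with a single one to a study-specific factor, and anything in between to the partially shared factors that Tetris is designed to capture.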
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Bayesian Methods and Mixture Models
