Bayesian Multi-study Factor Analysis for High-throughput Biological Data
Roberta De Vito, Ruggero Bellio, Lorenzo Trippa, Giovanni Parmigiani

TL;DR
This paper introduces a Bayesian multi-study factor analysis method that identifies shared and study-specific factors in high-throughput biological data. It outperforms standard factor analysis in simulations and reveals replicable gene patterns in breast cancer studies.
Contribution
It extends Multi-study Factor Analysis with a Bayesian sparse approach and develops novel identification and computational techniques for high-dimensional biological data.
Findings
Outperforms standard factor analysis in simulations
Identifies replicable gene patterns in breast cancer data
Provides an efficient Gibbs sampling algorithm
Abstract
This paper presents a new modeling strategy for joint unsupervised analysis of multiple high-throughput biological studies. As in Multi-study Factor Analysis, our goals are to identify both common factors shared across studies and study-specific factors. Our approach is motivated by the growing body of high-throughput studies in biomedical research, as exemplified by the comprehensive set of expression data on breast tumors considered in our case study. To handle high-dimensional studies, we extend Multi-study Factor Analysis using a Bayesian approach that imposes sparsity. Specifically, we generalize the sparse Bayesian infinite factor model to multiple studies. We also devise novel solutions for the identification of the loading matrices: we recover the loading matrices of interest ex-post, by adapting the orthogonal Procrustes approach. Computationally, we propose an efficient and…
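The model structure and the Procrustes step mentioned in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation. It assumes the usual multi-study factor form x = Phi f + Lambda_s l + e, with standard normal factors and diagonal noise, and it aligns a rotated loading matrix to a known reference, whereas the paper adapts the approach to recover the loadings ex-post. All names (simulate_study, procrustes_rotation, phi, lam_s, psi_diag) are hypothetical.

```python
import numpy as np

def simulate_study(phi, lam_s, psi_diag, n, rng):
    """Draw n observations from x = phi f + lam_s l + e (assumed model form)."""
    p, k = phi.shape
    j_s = lam_s.shape[1]
    f = rng.standard_normal((n, k))      # factors shared across studies
    l = rng.standard_normal((n, j_s))    # factors specific to this study
    e = rng.standard_normal((n, p)) * np.sqrt(psi_diag)  # diagonal noise
    return f @ phi.T + l @ lam_s.T + e

def procrustes_rotation(a, b):
    """Orthogonal matrix R minimizing the Frobenius norm of a @ R - b."""
    u, _, vt = np.linalg.svd(a.T @ b)
    return u @ vt

rng = np.random.default_rng(0)
p, k, j_s = 20, 3, 2                     # variables, shared and specific factors
phi = rng.standard_normal((p, k))        # shared loading matrix
lam_s = rng.standard_normal((p, j_s))    # study-specific loading matrix
psi_diag = np.full(p, 0.5)               # noise variances

x = simulate_study(phi, lam_s, psi_diag, n=50, rng=rng)

# Covariance implied by the assumed model for this study.
sigma_s = phi @ phi.T + lam_s @ lam_s.T + np.diag(psi_diag)

# Loadings are identified only up to rotation: phi @ q implies the same
# covariance for any orthogonal q, so a rotated copy must be aligned
# before loadings can be compared or averaged.
q, _ = np.linalg.qr(rng.standard_normal((k, k)))
phi_rot = phi @ q
r = procrustes_rotation(phi_rot, phi)
aligned = phi_rot @ r
```

Here the reference matrix is known, so the alignment recovers phi up to numerical error. In practice no true reference is available, which is why an adapted procedure is needed.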
Taxonomy
Topics: Gene expression and cancer classification · Bioinformatics and Genomic Networks · Genomics and Chromatin Dynamics
