TL;DR
This paper reviews and compares five advanced Bayesian integrative factor analysis methods, demonstrating their application to high-dimensional nutrition and genomics data to improve data integration, interpretability, and reproducibility.
Contribution
It provides a practical guide, including R code, for applying and comparing five Bayesian integrative factor models in multi-study biomedical data analysis.
Findings
Bayesian methods outperform standard FA in accuracy and robustness.
The tutorial includes real-data applications in nutrition and genomics.
Computational efficiency varies across methods, influencing practical choices.
Abstract
High-dimensional data are crucial in biomedical research. Integrating such data from multiple studies depends on the choice of advanced statistical models and, compared with analyzing each study separately, can enhance statistical power, reproducibility, and scientific insight. Factor analysis (FA) is a core dimensionality reduction technique that models observed data through a small set of latent factors. Bayesian extensions of FA have recently emerged as powerful tools for multi-study integration, enabling researchers to disentangle shared biological signals from study-specific variability. In this tutorial, we provide a practical and comparative guide to five advanced Bayesian integrative factor models: Perturbed Factor Analysis (PFA), Bayesian Factor Regression with non-local spike-and-slab priors (MOM-SS), Subspace Factor Analysis (SUFA), Bayesian Multi-study…
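To make the idea of separating shared signal from study-specific variability concrete, here is a minimal sketch of the covariance implied by a generic multi-study factor model. It assumes the textbook form in which each study's covariance is the sum of a shared part, a study-specific part, and diagonal noise. This form is an assumption for illustration and is not necessarily the parameterization used by any of the methods named in the abstract. All names and numbers (`phi`, `lambda_a`, `lambda_b`, `psi`) are hypothetical.

```python
# Illustrative sketch only: a generic multi-study factor model, assumed here
# for exposition. It is not the exact model of any method named in the abstract.

def outer_gram(a):
    """Return a @ a^T for a matrix stored as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in a] for row_i in a]

def implied_covariance(shared, specific, noise_var):
    """Covariance implied for one study: shared part + study-specific part + diagonal noise."""
    p = len(shared)
    common = outer_gram(shared)    # contribution of factors shared by all studies
    own = outer_gram(specific)     # contribution of factors unique to this study
    return [
        [common[i][j] + own[i][j] + (noise_var[i] if i == j else 0.0) for j in range(p)]
        for i in range(p)
    ]

# Hypothetical toy values: 3 observed variables, 1 shared factor,
# and 1 study-specific factor per study.
phi = [[1.0], [0.5], [0.0]]        # shared loadings, identical across studies
lambda_a = [[0.0], [0.0], [2.0]]   # loadings specific to study A
lambda_b = [[0.0], [1.0], [0.0]]   # loadings specific to study B
psi = [0.1, 0.1, 0.1]              # idiosyncratic noise variances

sigma_a = implied_covariance(phi, lambda_a, psi)
sigma_b = implied_covariance(phi, lambda_b, psi)
```

In this toy setup the covariance between the first two variables is the same in both studies because it comes only from the shared loadings, while the variances of the second and third variables differ between studies because of the study-specific loadings.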
