Statistical modelling of an outcome variable with integrated multi-omics
He Li, Zander Gu, Said el Bouhaddani, Jeanine Houwing-Duistermaat

TL;DR
This paper compares univariate and multivariate methods for integrating multi-omics data to model an outcome variable. Multivariate approaches perform better in simulations with two normally distributed omics datasets, while all methods perform similarly on real data.
Contribution
The paper describes one univariate and two multivariate methods for integrating multi-omics data in outcome modelling, and evaluates them in simulations and in real data.
Findings
In simulations with two correlated, multivariate normally distributed omics datasets, multivariate methods outperform the univariate method when modelling the outcome.
All methods perform similarly in real data applications involving metabolomics and genetic datasets.
Multivariate methods remain effective even with non-normal data, offering a promising alternative to high-dimensional approaches.
Abstract
In studies that aim to model the relationship between an outcome variable and multiple omics datasets, it is often desirable to reduce the dimensionality of these datasets or to represent one omics dataset in terms of another. Several approaches exist for this purpose, including univariate methods such as polygenic scores, and multivariate methods. Multivariate approaches offer advantages by producing lower-dimensional integrative scores, capturing joint structures across datasets, and filtering out dataset-specific noise. In this paper, we describe one univariate and two multivariate methods, and evaluate their performance through simulations involving two correlated multivariate normally distributed omics datasets, as well as a combination of one multivariate normal and one fixed categorical dataset. We assess method performance using the root mean squared error (RMSE) when modelling…
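The evaluation idea in the abstract (build a low-dimensional score from the omics data, model the outcome with it, and compare methods by RMSE) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the simulation design, the dimensions, and the function names are hypothetical, the univariate score is a generic polygenic-score-style sum of marginal regression weights, and the joint score is a generic PLS-type direction used as a stand-in, not necessarily one of the multivariate methods studied in the paper.

```python
import numpy as np

def simulate(n, p=50, q=20, seed=0):
    """Simulate two correlated normal blocks X (n x p), Y (n x q) and an outcome z.

    Hypothetical design: both blocks share one latent variable t,
    and the outcome depends on that joint part only.
    """
    rng = np.random.default_rng(seed)
    t = rng.normal(size=n)                    # joint latent component
    w = rng.normal(size=p) / np.sqrt(p)       # loadings of X on t
    c = rng.normal(size=q) / np.sqrt(q)       # loadings of Y on t
    X = np.outer(t, w) + rng.normal(size=(n, p))
    Y = np.outer(t, c) + rng.normal(size=(n, q))
    z = t + rng.normal(scale=0.5, size=n)
    return X, Y, z

def rmse(z_true, z_pred):
    """Root mean squared error between observed and predicted outcomes."""
    return float(np.sqrt(np.mean((z_true - z_pred) ** 2)))

def univariate_weights(X_train, z_train):
    """One marginal regression slope per feature (polygenic-score style)."""
    Xc = X_train - X_train.mean(axis=0)
    zc = z_train - z_train.mean()
    return Xc.T @ zc / np.sum(Xc ** 2, axis=0)

def joint_weights(X_train, Y_train):
    """First left singular vector of the cross-covariance X'Y (PLS-type direction)."""
    Xc = X_train - X_train.mean(axis=0)
    Yc = Y_train - Y_train.mean(axis=0)
    u, _, _ = np.linalg.svd(Xc.T @ Yc, full_matrices=False)
    return u[:, 0]

def fit_predict(score_train, z_train, score_test):
    """Regress the outcome on a single score and predict for the test scores."""
    A = np.column_stack([np.ones_like(score_train), score_train])
    beta, *_ = np.linalg.lstsq(A, z_train, rcond=None)
    return beta[0] + beta[1] * score_test

# Simulate once, then split, so that train and test share the same loadings.
X, Y, z = simulate(n=600)
tr, te = slice(0, 400), slice(400, 600)

w_uni = univariate_weights(X[tr], z[tr])
w_joint = joint_weights(X[tr], Y[tr])

rmse_uni = rmse(z[te], fit_predict(X[tr] @ w_uni, z[tr], X[te] @ w_uni))
rmse_joint = rmse(z[te], fit_predict(X[tr] @ w_joint, z[tr], X[te] @ w_joint))

print(f"univariate score RMSE: {rmse_uni:.3f}")
print(f"joint score RMSE:      {rmse_joint:.3f}")
```

The two RMSE values depend on the simulated design and the seed, so this sketch shows only how such a comparison is set up, not which approach wins. Note that the joint weights are estimated without using the outcome, whereas the univariate weights are fitted to it directly.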
Taxonomy
Topics: Metabolomics and Mass Spectrometry Studies · Health, Environment, Cognitive Aging · Gene expression and cancer classification
