Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information
Sandra E. Safo, Shuzhao Li, Qi Long

TL;DR
This paper develops a structured sparse canonical correlation analysis method that incorporates biological network information to identify relevant genes and metabolites associated with cardiovascular disease, improving interpretability and robustness.
Contribution
The paper introduces a novel structured sparse CCA method that integrates biological network information, enhancing variable selection and interpretability in high-dimensional data analysis.
Findings
Structured sparse CCA outperforms existing methods in simulations.
The method identifies known cardiovascular-related genes and pathways.
Results are robust to mis-specified structural information.
Abstract
Integrative analyses of different high dimensional data types are becoming increasingly popular. Similarly, incorporating prior functional relationships among variables in data analysis has been a topic of increasing interest as it helps elucidate underlying mechanisms among complex diseases. In this paper, the goal is to assess association between transcriptomic and metabolomic data from a Predictive Health Institute (PHI) study including healthy adults at high risk of developing cardiovascular diseases. To this end, we develop statistical methods for identifying sparse structure in canonical correlation analysis (CCA) with incorporation of biological/structural information. Our proposed methods use prior network structural information among genes and among metabolites to guide selection of relevant genes and metabolites in sparse CCA, providing insight on the molecular underpinning of…
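To make the idea in the abstract concrete, here is a minimal sketch of sparse CCA guided by prior network information. It is not the authors' estimator. It swaps in a generic stand-in: alternating soft-thresholded updates on the cross-correlation matrix, with an added graph-Laplacian smoothing step so that connected genes (or metabolites) tend to receive similar weights. All names (`structured_sparse_cca`, `simulate_demo`, `lam_x`, `lam_y`, `eta`) and the simulated data are hypothetical and chosen only for illustration.

```python
import numpy as np


def soft_threshold(a, lam):
    """Elementwise soft-thresholding, the usual operator behind L1 sparsity."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)


def graph_laplacian(adj):
    """Unnormalized Laplacian L = D - A of a symmetric adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj


def update_direction(cross, other, lap, lam, eta):
    """One update: project, smooth over the network, threshold, normalize."""
    z = cross @ other
    # Assumed smoothing step: solve (I + eta * L) z_new = z, which pulls the
    # weights of connected variables toward each other.
    z = np.linalg.solve(np.eye(z.size) + eta * lap, z)
    # Threshold relative to the largest entry, so lam in [0, 1) keeps >= 1 entry.
    z = soft_threshold(z, lam * np.abs(z).max())
    norm = np.linalg.norm(z)
    return z / norm if norm > 0 else z


def structured_sparse_cca(X, Y, adj_x, adj_y, lam_x=0.3, lam_y=0.3,
                          eta=1.0, n_iter=100, tol=1e-8):
    """Return sparse canonical vectors (u, v) and their sample correlation."""
    n = X.shape[0]
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    C = X.T @ Y / (n - 1)  # sample cross-correlation matrix (p x q)
    Lx, Ly = graph_laplacian(adj_x), graph_laplacian(adj_y)

    # Initialize with the leading singular vectors of C.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        u_new = update_direction(C, v, Lx, lam_x, eta)
        v_new = update_direction(C.T, u_new, Ly, lam_y, eta)
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    rho = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, rho


def chain_adjacency(size, n_linked):
    """Adjacency matrix linking the first n_linked variables in a chain."""
    adj = np.zeros((size, size))
    for i in range(n_linked - 1):
        adj[i, i + 1] = adj[i + 1, i] = 1.0
    return adj


def simulate_demo(seed=0, n=200, p=30, q=20):
    """Toy data: one latent factor drives 5 'genes' and 4 'metabolites'."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(n)
    X = rng.standard_normal((n, p))
    Y = rng.standard_normal((n, q))
    X[:, :5] = latent[:, None] + 0.5 * X[:, :5]
    Y[:, :4] = latent[:, None] + 0.5 * Y[:, :4]
    return X, Y, chain_adjacency(p, 5), chain_adjacency(q, 4)


X, Y, adj_x, adj_y = simulate_demo()
u, v, rho = structured_sparse_cca(X, Y, adj_x, adj_y)
print("selected genes:", np.flatnonzero(u))
print("selected metabolites:", np.flatnonzero(v))
print("canonical correlation:", round(float(rho), 3))
```

In this sketch `eta` controls how strongly the prior network shapes the weights (`eta=0` ignores it), while `lam_x` and `lam_y` control sparsity. The covariance matrices within each data type are implicitly treated as identity, a common simplification in high-dimensional sparse CCA that the paper itself may or may not adopt.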
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Metabolomics and Mass Spectrometry Studies
