A Bayesian Methodology for Estimation for Sparse Canonical Correlation
Siddhesh Kulkarni, Subhadip Pal, Jeremy T. Gaskins

TL;DR
This paper introduces a Bayesian approach to Sparse Canonical Correlation Analysis (CCA) using infinite factor models and hierarchical priors to improve robustness and sparsity in high-dimensional multi-omics data analysis.
Contribution
It develops a novel Bayesian methodology for Structured Sparse CCA employing hierarchical priors and infinite factor models, addressing the need for Bayesian tools in this area.
Findings
The proposed Bayesian method outperforms traditional CCA procedures in simulations.
Application to breast cancer multi-omics data demonstrates practical utility.
Hierarchical priors effectively induce sparsity at multiple model levels.
Abstract
It can be challenging to perform an integrative statistical analysis of multi-view high-dimensional data acquired from different experiments on each subject who participated in a joint study. Canonical Correlation Analysis (CCA) is a statistical procedure for identifying relationships between such data sets. In that context, Structured Sparse CCA (ScSCCA) is a rapidly emerging methodological area that aims for robust modeling of the interrelations between the different data modalities by assuming the corresponding CCA directional vectors to be sparse. Although it is a rapidly growing area of statistical methodology development, there is a need for developing related methodologies in the Bayesian paradigm. In this manuscript, we propose a novel ScSCCA approach where we employ a Bayesian infinite factor model and aim to achieve robust estimation by encouraging sparsity in two different…
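For readers unfamiliar with CCA, the following is a minimal sketch of classical (non-sparse, non-Bayesian) CCA. It is not the method proposed in the paper. It only illustrates what a pair of "CCA directional vectors" is: the weight vectors `a` and `b` that maximize the correlation between `X @ a` and `Y @ b`. The function name `first_canonical_pair`, the small `ridge` term, and the simulated two-view data are assumptions made for illustration.

```python
import numpy as np


def first_canonical_pair(X, Y, ridge=1e-8):
    """Classical CCA via whitening and an SVD (illustrative sketch only).

    Returns direction vectors a, b and the first canonical correlation.
    The ridge term is an assumption added for numerical stability.
    """
    # Center each view so that cross-products are covariances.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Sample covariance and cross-covariance matrices.
    Sxx = X.T @ X / (n - 1) + ridge * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + ridge * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    # Cholesky factors: Sxx = Lx Lx^T and Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)

    # Whitened cross-covariance K = Lx^{-1} Sxy Ly^{-T}.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T

    # Singular values of K are the canonical correlations.
    U, s, Vt = np.linalg.svd(K, full_matrices=False)

    # Map the leading singular vectors back to the original coordinates.
    a = np.linalg.solve(Lx.T, U[:, 0])
    b = np.linalg.solve(Ly.T, Vt[0, :])
    return a, b, s[0]


# Simulated two-view data (hypothetical): one shared latent signal z that
# loads only on the first feature of each view, so the true directions are sparse.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
X = rng.normal(size=(n, 5))
Y = rng.normal(size=(n, 4))
X[:, 0] = z + 0.5 * rng.normal(size=n)
Y[:, 0] = z + 0.5 * rng.normal(size=n)

a, b, rho = first_canonical_pair(X, Y)
print("first canonical correlation:", rho)
print("direction for X:", np.round(a, 3))
print("direction for Y:", np.round(b, 3))
```

In this simulation the true directions have a single nonzero entry each, yet the classical estimate generally returns dense vectors with small nonzero weights on the noise features. Sparse CCA methods, including the Bayesian approach summarized above, aim to shrink such weights toward zero.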
Taxonomy
Topics: Gene expression and cancer classification · Genetic Mapping and Diversity in Plants and Animals · Bioinformatics and Genomic Networks
