Sparse canonical correlation analysis from a predictive point of view
Ines Wilms, Christophe Croux

TL;DR
This paper introduces a sparse CCA method for high-dimensional data. It frames CCA as a regression problem with lasso penalties, so that each canonical variate uses only a subset of the variables and is easier to interpret, and it demonstrates the method on genomic data.
Contribution
The paper recasts CCA into a regression framework with sparsity-inducing lasso penalties, improving interpretability and making CCA applicable in high-dimensional settings.
Findings
Outperforms existing sparse CCA methods in simulations
Effective in high-dimensional genomic data analysis
Increases interpretability of canonical variates
Abstract
Canonical correlation analysis (CCA) describes the associations between two sets of variables by maximizing the correlation between linear combinations of the variables in each data set. However, in high-dimensional settings where the number of variables exceeds the sample size or when the variables are highly correlated, traditional CCA is no longer appropriate. This paper proposes a method for sparse CCA. Sparse estimation produces linear combinations of only a subset of variables from each data set, thereby increasing the interpretability of the canonical variates. We consider the CCA problem from a predictive point of view and recast it into a regression framework. By combining an alternating regression approach together with a lasso penalty, we induce sparsity in the canonical vectors. We compare the performance with other sparse CCA techniques in different simulation settings and…
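As a rough illustration of the alternating regression idea described in the abstract, the following Python sketch estimates a first pair of sparse canonical vectors. It is not the authors' implementation: the function names (`lasso_cd`, `sparse_cca_first_pair`), the SVD-based starting value, the fixed penalty values, and the rescaling step are all assumptions made for this example, and the paper's own choices for initialization, penalty selection, and higher-order pairs may differ.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200, tol=1e-8):
    """Coordinate descent for: minimize (1/(2n)) * ||y - X w||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-column mean square
    resid = np.array(y, dtype=float)       # residual y - X w (w starts at zero)
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (column j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * w[j]
            w_new = soft_threshold(rho, lam) / col_sq[j]
            delta = w_new - w[j]
            if delta != 0.0:
                resid -= X[:, j] * delta
                w[j] = w_new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return w


def sparse_cca_first_pair(X, Y, lam_a=0.2, lam_b=0.2, n_outer=50, tol=1e-6):
    """Alternate two lasso regressions to estimate sparse canonical vectors (a, b).

    Assumed scheme for illustration: regress the current variate Y b on X to
    update a, then regress X a on Y to update b, rescaling each variate to
    unit variance after every step.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)

    # Assumed starting value: leading right singular vector of the cross-product.
    _, _, vt = np.linalg.svd(X.T @ Y, full_matrices=False)
    b = vt[0]
    b = b / (Y @ b).std()
    a = np.zeros(X.shape[1])

    for _ in range(n_outer):
        a_old, b_old = a.copy(), b.copy()

        a = lasso_cd(X, Y @ b, lam_a)      # predict the Y-variate from X
        if not a.any():
            raise ValueError("lam_a is too large: all X coefficients are zero")
        a = a / (X @ a).std()

        b = lasso_cd(Y, X @ a, lam_b)      # predict the X-variate from Y
        if not b.any():
            raise ValueError("lam_b is too large: all Y coefficients are zero")
        b = b / (Y @ b).std()

        if np.linalg.norm(a - a_old) + np.linalg.norm(b - b_old) < tol:
            break

    corr = np.corrcoef(X @ a, Y @ b)[0, 1]
    return a, b, corr
```

Calling `sparse_cca_first_pair(X, Y)` on two data matrices with the same number of rows returns the two coefficient vectors and the sample correlation between the resulting variates. Larger values of `lam_a` and `lam_b` set more coefficients exactly to zero; in practice the penalties would be tuned rather than fixed as they are here.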
