Sparse canonical correlation analysis from a predictive point of view

Ines Wilms; Christophe Croux

arXiv:1501.01231·stat.ME·January 7, 2015

TL;DR

This paper introduces a sparse CCA method for high-dimensional data. It frames CCA as a regression problem and applies lasso penalties so that each canonical variate uses only a subset of the variables, which makes the variates easier to interpret. The method is demonstrated on genomic data.

Contribution

It recasts CCA into a regression framework with sparsity-inducing penalties, improving interpretability and applicability in high-dimensional settings.
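To make the regression view concrete, here is a sketch for the first canonical pair only. The notation is illustrative and not taken from the paper: X and Y are column-centered data matrices with n rows, a and b are the canonical vectors, and \lambda_a is an assumed penalty parameter. Under unit-variance constraints, maximizing the correlation is the same as minimizing a squared prediction error, and fixing one vector turns the update of the other into a lasso regression.

```latex
\begin{align}
  \tfrac{1}{n}\,\lVert Xa - Yb \rVert_2^2
    &= 2 - 2\,\operatorname{corr}(Xa,\, Yb)
    \qquad \text{when } \tfrac{1}{n}\lVert Xa \rVert_2^2 = \tfrac{1}{n}\lVert Yb \rVert_2^2 = 1, \\
  \hat{a} \mid b
    &= \arg\min_{a}\ \tfrac{1}{2n}\,\lVert Yb - Xa \rVert_2^2 + \lambda_a \lVert a \rVert_1 ,
\end{align}
```

After the penalized step, \hat{a} is rescaled so that X\hat{a} has unit variance. The update for b given a is symmetric, with the roles of X and Y exchanged.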

Findings

01 Outperforms existing sparse CCA methods in simulations

02 Effective in high-dimensional genomic data analysis

03 Increases interpretability of canonical variates

Abstract

Canonical correlation analysis (CCA) describes the associations between two sets of variables by maximizing the correlation between linear combinations of the variables in each data set. However, in high-dimensional settings where the number of variables exceeds the sample size or when the variables are highly correlated, traditional CCA is no longer appropriate. This paper proposes a method for sparse CCA. Sparse estimation produces linear combinations of only a subset of variables from each data set, thereby increasing the interpretability of the canonical variates. We consider the CCA problem from a predictive point of view and recast it into a regression framework. By combining an alternating regression approach together with a lasso penalty, we induce sparsity in the canonical vectors. We compare the performance with other sparse CCA techniques in different simulation settings and…
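The alternating regression idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it estimates only the first canonical pair, uses fixed penalties `lam_a` and `lam_b` instead of a data-driven choice, starts from the leading singular vector of the cross-product matrix, and uses a small hand-written coordinate descent lasso. All function and parameter names are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Scalar soft-thresholding operator used by the lasso update."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_sweeps=200, tol=1e-8):
    """Coordinate descent for (1/2n)||y - X beta||^2 + lam * ||beta||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()                      # residual y - X @ beta
    col_ms = (X ** 2).sum(axis=0) / n     # mean square of each column
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if col_ms[j] == 0.0:
                continue
            # correlation of column j with the partial residual
            rho = X[:, j] @ resid / n + col_ms[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ms[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid -= X[:, j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta

def standardize(M):
    """Center each column and scale it to unit variance."""
    M = M - M.mean(axis=0)
    return M / M.std(axis=0)

def scale_to_unit_variance(M, w):
    """Rescale w so that the variate M @ w has unit variance."""
    s = (M @ w).std()
    if s == 0.0:
        raise ValueError("penalty too large: all coefficients are zero")
    return w / s

def sparse_cca_first_pair(X, Y, lam_a=0.2, lam_b=0.2, n_iter=50, tol=1e-6):
    """Alternate lasso regressions of Y @ b on X and of X @ a on Y."""
    X, Y = standardize(X), standardize(Y)
    # Assumed starting value: leading left singular vector of Y^T X.
    U, _, _ = np.linalg.svd(Y.T @ X)
    b = scale_to_unit_variance(Y, U[:, 0])
    a = np.zeros(X.shape[1])
    for _ in range(n_iter):
        a_new = scale_to_unit_variance(X, lasso_cd(X, Y @ b, lam_a))
        b_new = scale_to_unit_variance(Y, lasso_cd(Y, X @ a_new, lam_b))
        change = max(np.abs(a_new - a).max(), np.abs(b_new - b).max())
        a, b = a_new, b_new
        if change < tol:
            break
    corr = np.corrcoef(X @ a, Y @ b)[0, 1]
    return a, b, corr

# Synthetic example: one latent signal drives the first two columns of X and Y.
rng = np.random.default_rng(0)
n, p, q = 200, 10, 8
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))
Y[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))

a, b, corr = sparse_cca_first_pair(X, Y)
print("nonzero entries in a:", np.flatnonzero(a))
print("nonzero entries in b:", np.flatnonzero(b))
print("correlation of the sparse canonical variates:", round(float(corr), 3))
```

Rescaling after each lasso step keeps both variates at unit variance, which is what makes the squared prediction error comparable to a correlation. Larger penalties give sparser canonical vectors; if a penalty is so large that every coefficient is shrunk to zero, the sketch raises an error rather than returning a degenerate variate.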
