On Sparse Canonical Correlation Analysis

Yongchun Li; Santanu S. Dey; Weijun Xie

arXiv:2401.00308 · math.OC · January 2, 2024 · NeurIPS · 1 citation

TL;DR

This paper investigates Sparse Canonical Correlation Analysis (SCCA), offering new formulations and algorithms to improve interpretability and address computational challenges in high-dimensional data analysis.

Contribution

The paper introduces a combinatorial formulation and a mixed-integer semidefinite programming model for SCCA, together with a complexity analysis and efficient algorithms.

Findings

01. Proposed approximation algorithms for SCCA.

02. Validated effectiveness through numerical experiments.

03. Established complexity results for special cases.

Abstract

The classical Canonical Correlation Analysis (CCA) identifies the correlations between two sets of multivariate variables based on their covariance and has been widely applied in diverse fields such as computer vision, natural language processing, and speech analysis. Despite its popularity, CCA can struggle to explain correlations between two variable sets in high-dimensional settings. This paper therefore studies Sparse Canonical Correlation Analysis (SCCA), which enhances the interpretability of CCA. We first show that SCCA generalizes three well-known sparse optimization problems (sparse PCA, sparse SVD, and sparse regression), all of which are NP-hard. This result motivates us to develop strong formulations and efficient algorithms. Our main contributions include (i) the introduction of a combinatorial formulation that captures the essence of…
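To make the combinatorial view concrete, here is a minimal sketch, not the authors' algorithm. It assumes the common cardinality-constrained form of SCCA: maximize x^T C y subject to x^T A x <= 1, y^T B y <= 1, with at most s1 nonzeros in x and s2 in y, where A and B are the covariance matrices of the two variable sets and C is their cross-covariance. Under that assumption, fixing the two supports reduces the problem to an ordinary CCA on the selected variables, so a brute-force search over supports solves tiny instances exactly. The function names (`inv_sqrt`, `canonical_corr`, `brute_force_scca`) and the synthetic data are hypothetical.

```python
import itertools
import numpy as np

def inv_sqrt(M, eps=1e-10):
    """Inverse square root of a symmetric PSD matrix (pseudo-inverse on ~zero eigenvalues)."""
    w, V = np.linalg.eigh(M)
    inv = np.where(w > eps, 1.0 / np.sqrt(np.maximum(w, eps)), 0.0)
    return (V * inv) @ V.T

def canonical_corr(A, B, C):
    """Largest canonical correlation: top singular value of A^{-1/2} C B^{-1/2}."""
    K = inv_sqrt(A) @ C @ inv_sqrt(B)
    return np.linalg.svd(K, compute_uv=False)[0]

def brute_force_scca(A, B, C, s1, s2):
    """Enumerate every support pair of sizes (s1, s2); exponential cost, toy sizes only."""
    n, m = C.shape
    best_val, best_S1, best_S2 = -np.inf, None, None
    for S1 in itertools.combinations(range(n), s1):
        for S2 in itertools.combinations(range(m), s2):
            val = canonical_corr(A[np.ix_(S1, S1)],
                                 B[np.ix_(S2, S2)],
                                 C[np.ix_(S1, S2)])
            if val > best_val:
                best_val, best_S1, best_S2 = val, S1, S2
    return best_val, best_S1, best_S2

# Synthetic data (an assumption for illustration): column 0 of Y is a noisy copy of column 1 of X.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 4))
Y = rng.standard_normal((200, 3))
Y[:, 0] = X[:, 1] + 0.1 * rng.standard_normal(200)

cov = np.cov(np.hstack([X, Y]), rowvar=False)
A, B, C = cov[:4, :4], cov[4:, 4:], cov[:4, 4:]

value, S1, S2 = brute_force_scca(A, B, C, s1=1, s2=1)
print(value, S1, S2)
```

The enumeration grows combinatorially with the number of variables, which is consistent with the NP-hardness noted in the abstract and is the reason stronger formulations and approximation algorithms are of interest.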

Videos

On Sparse Canonical Correlation Analysis · SlidesLive

Taxonomy

Topics: Computational Drug Discovery Methods · Bioinformatics and Genomic Networks · Gene expression and cancer classification