D-GCCA: Decomposition-based Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data

Hai Shu; Zhe Qu; Hongtu Zhu

arXiv:2001.02856 · stat.ML · September 19, 2022 · 5 cites

TL;DR

D-GCCA is a novel decomposition method for multi-view high-dimensional data that separates common and distinctive sources, providing consistent estimation and improved interpretability over existing methods.

Contribution

It introduces a decomposition-based generalized canonical correlation analysis that rigorously defines the model on the L2 space and incorporates orthogonality constraints for better source separation.
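For orientation, the L2 space of zero-mean, finite-variance random variables carries the standard inner product written below, under which orthogonality is the same as being uncorrelated. This is textbook background rather than the paper's specific constraint, and the symbols X and Y are generic placeholders that do not come from the source.

```latex
% Assumes E[X] = E[Y] = 0 and finite variances.
\langle X, Y \rangle = \operatorname{E}[XY] = \operatorname{cov}(X, Y),
\qquad
\|X\| = \sqrt{\operatorname{var}(X)},
\qquad
X \perp Y \iff \operatorname{cov}(X, Y) = 0 .
```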

Findings

01. D-GCCA achieves consistent low-rank matrix recovery.

02. The method outperforms existing techniques in simulations.

03. Efficient closed-form estimators enable large-scale data analysis.

Abstract

Modern biomedical studies often collect multi-view data, that is, multiple types of data measured on the same set of objects. A popular model in high-dimensional multi-view data analysis is to decompose each view's data matrix into a low-rank common-source matrix generated by latent factors common across all data views, a low-rank distinctive-source matrix corresponding to each view, and an additive noise matrix. We propose a novel decomposition method for this model, called decomposition-based generalized canonical correlation analysis (D-GCCA). The D-GCCA rigorously defines the decomposition on the L2 space of random variables in contrast to the Euclidean dot product space used by most existing methods, thereby being able to provide the estimation consistency for the low-rank matrix recovery. Moreover, to well calibrate common latent factors, we impose a desirable orthogonality…
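The three-part model described in the abstract can be illustrated with a small simulation. The sketch below only generates data of the form "common-source + distinctive-source + noise" for each view; it is not the D-GCCA estimator and does not impose the paper's orthogonality constraint. The function name `simulate_views`, the variables-by-objects matrix orientation, the Gaussian factors, and every dimension and rank are assumptions chosen for illustration.

```python
import numpy as np


def simulate_views(n=200, dims=(50, 60, 70), r_common=2, r_distinct=3,
                   noise_sd=0.1, seed=0):
    """Simulate views Y_k = C_k + D_k + E_k measured on the same n objects.

    Hypothetical helper: each view is a (p_k x n) matrix whose columns
    correspond to the shared objects.
    """
    rng = np.random.default_rng(seed)
    # Latent factors common to all views (r_common x n).
    common_factors = rng.standard_normal((r_common, n))
    views = []
    for p in dims:
        # Latent factors specific to this view (r_distinct x n).
        distinct_factors = rng.standard_normal((r_distinct, n))
        # Low-rank common-source matrix: random loadings times shared factors.
        C = rng.standard_normal((p, r_common)) @ common_factors
        # Low-rank distinctive-source matrix for this view.
        D = rng.standard_normal((p, r_distinct)) @ distinct_factors
        # Additive noise matrix.
        E = noise_sd * rng.standard_normal((p, n))
        views.append({"Y": C + D + E, "C": C, "D": D, "E": E})
    return views


views = simulate_views()
for k, v in enumerate(views):
    # Report each view's shape and the ranks of its two signal parts.
    print(k, v["Y"].shape,
          np.linalg.matrix_rank(v["C"]),
          np.linalg.matrix_rank(v["D"]))
```

Keeping the true `C`, `D`, and `E` alongside `Y` makes it possible to compare any decomposition method's output against the known signal parts.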

Code & Models

Repositories

shu-hai/d-gcca (Official)

Videos

D-GCCA: Decomposition-based Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data · slideslive

Taxonomy

Topics: Functional Brain Connectivity Studies · Advanced Neuroimaging Techniques and Applications · Gene expression and cancer classification