D-GCCA: Decomposition-based Generalized Canonical Correlation Analysis for Multi-view High-dimensional Data
Hai Shu, Zhe Qu, Hongtu Zhu

TL;DR
D-GCCA is a novel decomposition method for multi-view high-dimensional data that separates common and distinctive sources, providing consistent estimation and improved interpretability over existing methods.
Contribution
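The paper introduces decomposition-based generalized canonical correlation analysis (D-GCCA), which defines the decomposition on the L2 space of random variables rather than on the Euclidean dot product space used by most existing methods, and incorporates orthogonality constraints for better source separation.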
Findings
D-GCCA achieves consistent low-rank matrix recovery.
The method outperforms existing techniques in simulations.
Efficient closed-form estimators enable large-scale data analysis.
Abstract
Modern biomedical studies often collect multi-view data, that is, multiple types of data measured on the same set of objects. A popular model in high-dimensional multi-view data analysis is to decompose each view's data matrix into a low-rank common-source matrix generated by latent factors common across all data views, a low-rank distinctive-source matrix corresponding to each view, and an additive noise matrix. We propose a novel decomposition method for this model, called decomposition-based generalized canonical correlation analysis (D-GCCA). The D-GCCA rigorously defines the decomposition on the L2 space of random variables in contrast to the Euclidean dot product space used by most existing methods, thereby being able to provide the estimation consistency for the low-rank matrix recovery. Moreover, to well calibrate common latent factors, we impose a desirable orthogonality…
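To make the three-part model in the abstract concrete, here is a minimal Python sketch that simulates data of that form: each view is the sum of a low-rank common-source matrix, a low-rank distinctive-source matrix, and additive noise. It only generates data from the model; it is not the D-GCCA estimator. The function name `simulate_views`, the Gaussian factors and loadings, the chosen ranks, and the layout (rows are variables, columns are objects) are all illustrative assumptions, not taken from the paper.

```python
import numpy as np


def simulate_views(n=200, dims=(50, 60, 70), r_common=2, r_distinct=3,
                   noise_sd=0.1, seed=0):
    """Simulate multi-view data X_k = C_k + D_k + E_k (hypothetical helper).

    Rows are variables and columns are the n shared objects (an assumption).
    """
    rng = np.random.default_rng(seed)
    # Latent factors shared by every view: they drive the common-source matrices.
    f_common = rng.standard_normal((r_common, n))
    views = []
    for p in dims:
        load_common = rng.standard_normal((p, r_common))      # view-specific loadings
        load_distinct = rng.standard_normal((p, r_distinct))
        f_distinct = rng.standard_normal((r_distinct, n))     # factors private to this view
        C = load_common @ f_common                 # low-rank common-source matrix
        D = load_distinct @ f_distinct             # low-rank distinctive-source matrix
        E = noise_sd * rng.standard_normal((p, n)) # additive noise matrix
        views.append({"X": C + D + E, "C": C, "D": D, "E": E})
    return views


views = simulate_views()
for k, v in enumerate(views):
    print(k, v["X"].shape,
          np.linalg.matrix_rank(v["C"]), np.linalg.matrix_rank(v["D"]))
```

A decomposition method for this model sees only the observed `X` matrices and aims to recover the `C` and `D` parts of each view; the simulated ground truth makes it possible to measure how well any such method does.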
Taxonomy
Topics: Functional Brain Connectivity Studies · Advanced Neuroimaging Techniques and Applications · Gene expression and cancer classification
