MM-PCA: Integrative Analysis of Multi-group and Multi-view Data
Jonatan Kallus, Patrik Johansson, Sven Nelander, Rebecka Jörnsten

TL;DR
MM-PCA is a data-adaptive method for integrating multi-group and multi-view data. It identifies structure that is shared across some groups and views and structure that is distinct to others, with the aim of improving the analysis of heterogeneous datasets. It is demonstrated on simulated and cancer genomics data.
Contribution
Introduces MM-PCA, a new method for flexible, structure-aware integration of multi-group and multi-view data, allowing for partial sharing and bi-clustering.
Findings
Applied to simulated data to demonstrate the method's effectiveness.
Applied to TCGA omics data, where it revealed biologically meaningful structure.
Provides an R package for practical use.
Abstract
Data integration is the problem of combining multiple data groups (studies, cohorts) and/or multiple data views (variables, features). This task is becoming increasingly important in many disciplines due to the prevalence of large and heterogeneous data sets. Data integration commonly aims to identify structure that is consistent across multiple cohorts and feature sets. While such joint analyses can boost information from single data sets, it is also possible that a globally restrictive integration of heterogeneous data may obscure signals of interest. Here, we therefore propose a data-adaptive integration method, allowing for structure in data to be shared across an a priori unknown subset of cohorts and views. The method, Multi-group Multi-view Principal Component Analysis (MM-PCA), identifies partially shared, sparse low-rank components. This also results in an integrative…
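To make "partially shared, sparse low-rank components" concrete, here is a minimal Python sketch on toy data. It is not the MM-PCA algorithm and not the interface of the authors' R package. It is a generic sparse rank-1 approximation (alternating updates with soft-thresholding), used here only to illustrate the idea. The names `soft_threshold`, `sparse_rank1`, the penalty `lam`, and the simulated views A, B, C are all assumptions made for this example. Views A and B carry a common signal, view C is pure noise, and the sparse loadings indicate which views take part in the component.

```python
import numpy as np


def soft_threshold(x, lam):
    """Shrink each entry of x toward zero by lam; small entries become exactly 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def sparse_rank1(X, lam, n_iter=50):
    """Rank-1 approximation X ~ u v^T with sparse (soft-thresholded) loadings v.

    Illustrative only: a generic sparse PCA step, not the MM-PCA estimator.
    """
    # Start from the leading right singular vector of X.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    v = vt[0]
    u = np.zeros(X.shape[0])
    for _ in range(n_iter):
        xv = X @ v
        norm = np.linalg.norm(xv)
        if norm == 0.0:  # every loading was thresholded away
            break
        u = xv / norm                      # unit-length sample scores
        v = soft_threshold(X.T @ u, lam)   # sparse feature loadings
    return u, v


# Toy data (hypothetical): one cohort of n samples measured in three views.
rng = np.random.default_rng(0)
n, p = 50, 10
z = rng.normal(size=n)  # latent signal shared by views A and B only
views = {
    "A": 2.0 * np.outer(z, np.ones(p)) + rng.normal(size=(n, p)),
    "B": 2.0 * np.outer(z, np.ones(p)) + rng.normal(size=(n, p)),
    "C": rng.normal(size=(n, p)),  # noise only, no shared structure
}

# Concatenate the views column-wise and fit one sparse component.
X = np.hstack(list(views.values()))
u, v = sparse_rank1(X, lam=5.0)

# A view takes part in the component if any of its loadings is nonzero.
active = [
    name
    for i, name in enumerate(views)
    if np.any(v[i * p:(i + 1) * p] != 0)
]
print("views sharing the component:", active)
```

In this toy setting the penalty zeroes out loadings for features with no signal, so the set of views in which the component is present is read off from the data rather than fixed in advance. The value of `lam` was picked by hand for this simulation; how a comparable penalty is chosen in MM-PCA is not covered by the text above.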
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Metabolomics and Mass Spectrometry Studies
