Data Integration Via Analysis of Subspaces (DIVAS)
Jack B. Prothero, Meilei Jiang, Jan Hannig, Quoc Tran-Dinh, Andrew Ackerman, J. S. Marron

TL;DR
DIVAS is a novel method for integrating multi-omics data by analyzing subspace structures, effectively identifying shared and distinct features across data types, especially in high-dimensional, low-sample-size scenarios.
Contribution
The paper introduces DIVAS, a new algorithm combining subspace perturbation theory and convex optimization for multi-block data integration with built-in inference.
Findings
Effective in high-dimensional, low-sample-size settings
Identifies partially-shared structures across data types
Provides statistical inference on subspace relationships
Abstract
Modern data collection in many data paradigms, including bioinformatics, often incorporates multiple traits derived from different data types (i.e. platforms). We call this data multi-block, multi-view, or multi-omics data. The emergent field of data integration develops and applies new methods for studying multi-block data and identifying how different data types relate and differ. One major frontier in contemporary data integration research is methodology that can identify partially-shared structure between sub-collections of data types. This work presents a new approach: Data Integration Via Analysis of Subspaces (DIVAS). DIVAS combines new insights in angular subspace perturbation theory with recent developments in matrix signal processing and convex-concave optimization into one algorithm for exploring partially-shared structure. Based on principal angles between subspaces, DIVAS…
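The abstract's reference to principal angles between subspaces can be illustrated with a small numerical sketch. This is not the DIVAS algorithm itself, only the standard computation of principal angles through orthonormal bases and a singular value decomposition. The function name `principal_angles`, the toy matrices `A` and `B`, and the choice of one planted shared direction are assumptions made for illustration.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians, ascending) between the column spaces of A and B.

    Assumes both matrices have full column rank.
    """
    # Orthonormal bases for each column space via reduced QR.
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # Clip to guard against round-off pushing a cosine slightly above 1.
    return np.arccos(np.clip(cosines, -1.0, 1.0))

# Toy example (hypothetical data): two 3-dimensional subspaces of R^50
# that share exactly one planted direction.
rng = np.random.default_rng(0)
n = 50
shared = rng.standard_normal((n, 1))
A = np.hstack([shared, rng.standard_normal((n, 2))])
B = np.hstack([shared, rng.standard_normal((n, 2))])

angles = principal_angles(A, B)
print(np.degrees(angles))
```

In this toy setup the smallest angle is numerically zero because of the planted shared direction, while the remaining angles reflect the unrelated random directions. With noisy data, an exact zero is not expected, which is why a method in this area needs a principled threshold for deciding when a small angle indicates shared structure.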
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Gene Regulatory Network Analysis
