Scalable Randomized Kernel Methods for Multiview Data Integration and Prediction
Sandra E. Safo, Han Lu

TL;DR
This paper introduces scalable randomized kernel methods for integrating multiview data and predicting outcomes, effectively capturing nonlinear relationships and identifying key variables, with applications to COVID-19 molecular data.
Contribution
The paper presents a novel scalable approach using randomized Fourier bases for nonlinear multiview data integration and outcome prediction, suitable for small sample sizes.
Findings
Outperforms several other linear and nonlinear multiview integration methods in simulation studies
Identifies molecular signatures associated with COVID-19 severity
Effective for small sample size problems
Abstract
We develop scalable randomized kernel methods for jointly associating data from multiple sources and simultaneously predicting an outcome or classifying a unit into one of two or more classes. The proposed methods model nonlinear relationships in multiview data together with predicting a clinical outcome and are capable of identifying variables or groups of variables that best contribute to the relationships among the views. We use the idea that random Fourier bases can approximate shift-invariant kernel functions to construct nonlinear mappings of each view and we use these mappings and the outcome variable to learn view-independent low-dimensional representations. Through simulation studies, we show that the proposed methods outperform several other linear and nonlinear methods for multiview data integration. When the proposed methods were applied to gene expression, metabolomics,…
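The random Fourier idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Gaussian (RBF) kernel, and the function names, the `gamma` bandwidth, and the number of random features are hypothetical choices made only for illustration. For a single view, it draws random frequencies from the kernel's spectral density, builds the cosine feature map, and checks that inner products of the mapped data approximate the exact kernel matrix. In a multiview setting, each view would presumably receive its own mapping of this kind.

```python
import numpy as np

def random_fourier_features(X, n_features=2000, gamma=1.0, seed=0):
    """Map X (n x d) to random Fourier features approximating the
    Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).
    (Kernel choice and parameter names are assumptions for this sketch.)"""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # The spectral density of this Gaussian kernel is N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, n_features))
    # Random phase shifts, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, gamma=1.0):
    """Exact Gaussian kernel matrix, used only for comparison."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))

# Toy data standing in for one view: 20 samples, 5 variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 5))

Z = random_fourier_features(X, n_features=5000, gamma=0.1)
K_exact = rbf_kernel(X, gamma=0.1)
K_approx = Z @ Z.T  # inner products of the random features

max_err = np.max(np.abs(K_exact - K_approx))
print(f"max absolute kernel approximation error: {max_err:.4f}")
```

The appeal of this construction is that downstream computations work with the n x D feature matrix `Z` directly, so a method can use linear algebra on `Z` in place of forming and decomposing an n x n kernel matrix; the approximation error shrinks as the number of random features grows.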
Taxonomy
Topics: Gene expression and cancer classification · Liver Disease Diagnosis and Treatment · Machine Learning in Bioinformatics
