A Framework for Implementing Machine Learning on Omics Data
Geoffroy Dubourg-Felonneau, Timothy Cannings, Fergal Cotter, Hannah Thompson, Nirmesh Patel, John W Cassidy, Harry W Clifford

TL;DR
This paper introduces a framework for integrating and analyzing high-dimensional -omics data to enhance machine learning applications in clinical research, demonstrated through breast cancer survival prediction.
Contribution
The authors present a novel framework for combining diverse -omics datasets and handling high dimensionality, improving machine learning analysis in clinical settings.
Findings
Successful integration of multi-analyte breast cancer data
Improved accuracy in survival prediction
Lower variance compared to individual dataset models
Abstract
The potential benefits of applying machine learning methods to -omics data are becoming increasingly apparent, especially in clinical settings. However, the unique characteristics of these data are not always well suited to machine learning techniques. These data are often generated across different technologies in different labs, and frequently with high dimensionality. In this paper we present a framework for combining -omics data sets, and for handling high dimensional data, making -omics research more accessible to machine learning applications. We demonstrate the success of this framework through integration and analysis of multi-analyte data for a set of 3,533 breast cancers. We then use this data-set to predict breast cancer patient survival for individuals at risk of an impending event, with higher accuracy and lower variance than methods trained on individual data-sets. We hope…
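The abstract names two steps, combining -omics data sets and handling high dimensionality, without giving details here. As a rough illustration only, the sketch below joins two toy per-patient tables on shared sample IDs and then keeps the highest-variance features. The function names (`combine_omics`, `top_variance_features`), the variance filter, and the toy values are all assumptions made for this example; they are not the authors' method or data.

```python
from statistics import pvariance


def combine_omics(tables):
    """Inner-join several {sample_id: {feature: value}} tables on sample ID.

    Feature names are prefixed with the table name so that features from
    different assays cannot collide. Hypothetical helper, not from the paper.
    """
    shared = set.intersection(*(set(t) for t in tables.values()))
    combined = {}
    for sample in sorted(shared):
        row = {}
        for name, table in tables.items():
            for feature, value in table[sample].items():
                row[f"{name}:{feature}"] = value
        combined[sample] = row
    return combined


def top_variance_features(combined, k):
    """Keep the k features with the largest population variance across samples.

    A simple stand-in for dimensionality reduction; assumes every sample
    carries the same set of features.
    """
    features = sorted(next(iter(combined.values())))
    variances = {
        f: pvariance([row[f] for row in combined.values()]) for f in features
    }
    keep = sorted(features, key=lambda f: variances[f], reverse=True)[:k]
    return {s: {f: row[f] for f in keep} for s, row in combined.items()}


# Toy data with made-up sample IDs and feature names.
expression = {
    "P1": {"GENE_A": 2.0, "GENE_B": 5.0},
    "P2": {"GENE_A": 2.1, "GENE_B": 1.0},
    "P3": {"GENE_A": 1.9, "GENE_B": 3.0},
}
mutation = {
    "P1": {"GENE_C": 1.0},
    "P2": {"GENE_C": 0.0},
    "P4": {"GENE_C": 1.0},
}

combined = combine_omics({"expression": expression, "mutation": mutation})
reduced = top_variance_features(combined, k=2)
```

Only samples present in every table survive the join (here `P1` and `P2`), which is one simple way to reconcile data generated in different labs. A real pipeline would also need normalization across technologies and a strategy for missing values.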
Taxonomy
Topics: Bioinformatics and Genomic Networks · Gene expression and cancer classification · Gene Regulatory Network Analysis
