Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities

Marinka Zitnik; Francis Nguyen; Bo Wang; Jure Leskovec; Anna Goldenberg; Michael M. Hoffman

arXiv:1807.00123·q-bio.QM·October 22, 2018

TL;DR

This paper reviews principles, methods, and challenges of integrating diverse biological and medical data types to better understand complex health phenomena and improve predictive modeling.

Contribution

It provides a comprehensive overview of current data integration techniques, their applications, and future challenges in biology and medicine.

Findings

01 Successful examples of data integration in biomedical research

02 Current methods effectively combine heterogeneous data types

03 Identified challenges and future directions for integrative approaches

Abstract

New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include a myriad of properties describing genome, epigenome, transcriptome, microbiome, phenotype, and lifestyle. No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease. Integrative methods that combine data from multiple technologies have thus emerged as critical statistical and computational approaches. The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view. An ideal method can answer a biological or medical question, identifying important features and predicting outcomes, by harnessing heterogeneous data across several dimensions of biological variation. In this Review, we…
