Incorporating biological structure into machine learning models in biomedicine

Jake Crawford; Casey S. Greene

arXiv:1910.06738·q-bio.GN·October 16, 2019

TL;DR

This paper discusses how incorporating biological structure, such as sequences, networks, and ontologies, into machine learning models can benefit biomedical applications by leveraging prior knowledge and improving interpretability.

Contribution

It reviews recent methods that embed biological structure into machine learning models, emphasizing the importance of structured data in biomedical research.

Findings

01 Structured data improves model interpretability

02 Incorporating biological priors enhances learning with limited samples

03 Open source tools and benchmarking are needed for progress

Abstract

In biomedical applications of machine learning, relevant information often has a rich structure that is not easily encoded as real-valued predictors. Examples of such data include DNA or RNA sequences, gene sets or pathways, gene interaction or coexpression networks, ontologies, and phylogenetic trees. We highlight recent examples of machine learning models that use structure to constrain model architecture or incorporate structured data into model training. For machine learning in biomedicine, where sample size is limited and model interpretability is critical, incorporating prior knowledge in the form of structured data can be particularly useful. The area of research would benefit from performant open source implementations and independent benchmarking efforts.
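
As a rough illustration of what "using structure to constrain model architecture" can mean, the sketch below builds a hidden layer in which each unit may only read the genes belonging to one gene set. It is a toy example rather than a method taken from the paper: the function names (`build_mask`, `masked_forward`), the two gene sets, and the random weights are all assumptions made for illustration.

```python
import math
import random


def build_mask(genes, gene_sets):
    """Return an n_genes x n_sets matrix with 1.0 where a gene is in a set."""
    index = {gene: i for i, gene in enumerate(genes)}
    mask = [[0.0] * len(gene_sets) for _ in genes]
    for j, members in enumerate(gene_sets.values()):
        for gene in members:
            if gene in index:  # ignore set members that were not measured
                mask[index[gene]][j] = 1.0
    return mask


def masked_forward(x, weights, mask):
    """Hidden layer whose gene-to-unit connections are restricted by the mask."""
    n_genes, n_sets = len(mask), len(mask[0])
    return [
        [
            math.tanh(sum(row[i] * weights[i][j] * mask[i][j] for i in range(n_genes)))
            for j in range(n_sets)
        ]
        for row in x
    ]


# Hypothetical toy inputs: four genes grouped into two illustrative gene sets.
genes = ["TP53", "MDM2", "KRAS", "BRAF"]
gene_sets = {
    "p53_signaling": ["TP53", "MDM2"],
    "mapk_signaling": ["KRAS", "BRAF"],
}
mask = build_mask(genes, gene_sets)

random.seed(0)
weights = [[random.gauss(0.0, 1.0) for _ in gene_sets] for _ in genes]
x = [[random.gauss(0.0, 1.0) for _ in genes] for _ in range(3)]  # 3 samples
hidden = masked_forward(x, weights, mask)

# Perturbing KRAS cannot affect the p53 unit, because that connection is masked.
x_perturbed = [row[:2] + [row[2] + 5.0] + row[3:] for row in x]
hidden_perturbed = masked_forward(x_perturbed, weights, mask)
```

Because every hidden unit corresponds to a named gene set, its activation can be read as a set-level score. This is one simple way that structured prior knowledge can reduce the number of free parameters and aid interpretation when samples are limited.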

Code & Models

Repositories

greenelab/biopriors-review
Official