TL;DR
Biomedical data such as sequences, networks, and ontologies carry structure that machine learning models can exploit, either by constraining model architecture or by using the structured data in training. Doing so encodes prior knowledge and can improve interpretability, which matters most when sample sizes are limited.
Contribution
The paper reviews recent machine learning methods that use biological structure to constrain model architecture or that incorporate structured data into model training, and argues that this kind of prior knowledge is especially useful in biomedicine.
Findings
Incorporating structured data can improve model interpretability
Biological priors can help models learn from limited samples
Performant open source implementations and independent benchmarking would benefit the field
Abstract
In biomedical applications of machine learning, relevant information often has a rich structure that is not easily encoded as real-valued predictors. Examples of such data include DNA or RNA sequences, gene sets or pathways, gene interaction or coexpression networks, ontologies, and phylogenetic trees. We highlight recent examples of machine learning models that use structure to constrain model architecture or incorporate structured data into model training. For machine learning in biomedicine, where sample size is limited and model interpretability is critical, incorporating prior knowledge in the form of structured data can be particularly useful. This area of research would benefit from performant open source implementations and independent benchmarking efforts.
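As a rough illustration of what "using structure to constrain model architecture" can mean, the sketch below masks the weights of a single linear layer so that each hidden unit only receives input from the genes in one gene set. It is a minimal toy example and not a method taken from the paper: the gene names, the pathway names, and the helper functions `membership_mask` and `masked_layer` are all hypothetical.

```python
import numpy as np

# Hypothetical toy data: 5 genes and 2 gene sets (pathways); names are illustrative only.
genes = ["g1", "g2", "g3", "g4", "g5"]
pathways = {"pathway_a": ["g1", "g2", "g3"], "pathway_b": ["g3", "g4", "g5"]}


def membership_mask(genes, pathways):
    """Binary (n_genes, n_pathways) matrix: 1 where the gene belongs to the pathway."""
    index = {g: i for i, g in enumerate(genes)}
    mask = np.zeros((len(genes), len(pathways)))
    for j, members in enumerate(pathways.values()):
        for g in members:
            mask[index[g], j] = 1.0
    return mask


def masked_layer(x, weights, mask):
    """Linear layer with tanh activation; weights outside known memberships are zeroed."""
    return np.tanh(x @ (weights * mask))


rng = np.random.default_rng(0)
mask = membership_mask(genes, pathways)
weights = rng.normal(size=mask.shape)      # dense weights, sparsified by the mask
x = rng.normal(size=(4, len(genes)))       # 4 samples of simulated expression values
hidden = masked_layer(x, weights, mask)    # shape (4, 2): one hidden unit per pathway
```

Because each hidden unit corresponds to a named gene set, its activation can be read as a pathway-level score, and the mask removes parameters that the prior knowledge says are unnecessary. This is one simple way such a constraint might support both interpretability and learning from few samples.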
