DOME: Recommendations for supervised machine learning validation in biology

Ian Walsh; Dmytro Fishman; Dario Garcia-Gasulla; Tiina Titma; Gianluca Pollastri; The ELIXIR Machine Learning focus group; Jen Harrow; Fotis E. Psomopoulos; Silvio C.E. Tosatto

arXiv:2006.16189·q-bio.OT·January 8, 2021

TL;DR

This paper proposes DOME, a structured set of community-wide recommendations for standardizing how supervised machine learning is validated in biology, so that reviewers and readers can better assess a method's performance and limitations.

Contribution

The paper introduces DOME (data, optimization, model, evaluation), a structured methods description intended to help reviewers and readers understand and evaluate machine learning methods in biological research.

Findings

01 Formulates the recommendations as structured validation questions for ML methods

02 Encourages including the answers to these questions in the supplementary material of published papers

03 Aims to improve reproducibility and assessment in biological ML studies

Abstract

Modern biology frequently relies on machine learning to provide predictions and improve decision processes. There have been recent calls for more scrutiny on machine learning performance and possible limitations. Here we present a set of community-wide recommendations aiming to help establish standards of supervised machine learning validation in biology. Adopting a structured methods description for machine learning based on data, optimization, model, evaluation (DOME) will aim to help both reviewers and readers to better understand and assess the performance and limitations of a method or outcome. The recommendations are formulated as questions to anyone wishing to pursue implementation of a machine learning algorithm. Answers to these questions can be easily included in the supplementary material of published papers.
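As a rough illustration of what a "structured methods description" could look like in practice, the sketch below groups free-text answers under the four DOME sections named in the abstract and reports which sections are still empty. Only the four section names come from the abstract; the `DomeReport` class, its methods, and the example questions and answers are hypothetical placeholders, not the paper's actual question list.

```python
from dataclasses import dataclass, field

# The four DOME sections named in the abstract.
DOME_SECTIONS = ("data", "optimization", "model", "evaluation")


@dataclass
class DomeReport:
    """Free-text answers grouped by DOME section (hypothetical layout)."""

    answers: dict = field(
        default_factory=lambda: {section: {} for section in DOME_SECTIONS}
    )

    def answer(self, section, question, text):
        """Record one answer under the given section."""
        if section not in self.answers:
            raise ValueError(f"unknown DOME section: {section}")
        self.answers[section][question] = text

    def missing_sections(self):
        """Return the sections that have no answers yet, in DOME order."""
        return [s for s in DOME_SECTIONS if not self.answers[s]]

    def to_text(self):
        """Render the report as plain text, e.g. for supplementary material."""
        lines = []
        for section in DOME_SECTIONS:
            lines.append(section.capitalize())
            for question, text in self.answers[section].items():
                lines.append(f"  {question}: {text}")
        return "\n".join(lines)


# Placeholder questions and answers, for illustration only.
report = DomeReport()
report.answer("data", "How was the data split?", "illustrative answer")
report.answer("evaluation", "Which metrics are reported?", "illustrative answer")
print(report.missing_sections())  # ['optimization', 'model']
print(report.to_text())
```

Keeping the section names in one fixed tuple means an incomplete description is easy to detect before submission, which is the kind of check a reviewer could otherwise only do by reading the methods text.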
