Advancing regulatory genomics with machine learning
Laurent Bréhélin

TL;DR
This paper reviews machine learning methods in regulatory genomics, highlighting techniques from linear models to deep learning, and discusses how to extract and validate biological hypotheses from these models.
Contribution
It provides a comprehensive overview of machine learning approaches in regulatory genomics and discusses methods for assessing confidence in biological insights.
Findings
Deep learning models enable complex gene regulation predictions
Different models offer varying confidence measures for hypotheses
Review of strategies for hypothesis extraction and validation
Abstract
In recent years, several machine learning approaches have been proposed to predict gene expression and epigenetic signals from the DNA sequence alone. These models are often used to deduce, and, to some extent, assess putative new biological insights about gene regulation, and they have led to very interesting advances in regulatory genomics. This article reviews a selection of these methods, ranging from linear models to random forests, kernel methods, and more advanced deep learning models. Specifically, we detail the different techniques and strategies that can be used to extract new gene-regulation hypotheses from these models. Furthermore, because these putative insights need to be validated with wet-lab experiments, we emphasize that it is important to have a measure of confidence associated with the extracted hypotheses. We review the procedures that have been proposed to measure…
Taxonomy
Topics: Genomics and Chromatin Dynamics · Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms
