Leveraging genomic deep learning models for the prediction of non-coding variant effects
Pooja Kathail, Ayesha Bajwa, Nilah M. Ioannidis

TL;DR
This review discusses how genomic deep learning models, including supervised and self-supervised approaches, are used to predict the effects of non-coding genetic variants, highlighting current progress, challenges, and future opportunities.
Contribution
It provides a comprehensive overview of the state-of-the-art in leveraging deep learning models for non-coding variant effect prediction, including practical considerations and evaluation strategies.
Findings
Deep learning models can predict molecular phenotypes from DNA sequences.
How variant effect predictions are evaluated depends on the type of ground truth data used.
Current models are most useful in specific settings, which the review identifies by categorizing that ground truth data.
Abstract
Characterizing non-coding variant function remains an important challenge in human genetics. Genomic deep learning models have emerged as a promising approach to enable in silico prediction of variant effects. These include supervised sequence-to-activity models, which predict molecular phenotypes such as genome-wide chromatin states or gene expression levels directly from DNA sequence, and self-supervised genomic language models. Here, we review progress in leveraging these models for non-coding variant effect prediction. We describe practical considerations for making such predictions and categorize the types of ground truth data used to evaluate variant effect predictions, providing insight into the settings in which current models are most useful. Our Review highlights key considerations for practitioners and opportunities for improvement in model development and evaluation.
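As a rough illustration of the in silico variant effect prediction described in the abstract, one common pattern is to score the reference and alternate sequences with a sequence-to-activity model and take the difference. The sketch below is a minimal example under stated assumptions, not the method of any specific model covered by the review: `toy_model` is a hypothetical stand-in that simply counts a motif, and `variant_effect` is an illustrative helper name.

```python
from typing import Callable


def toy_model(seq: str) -> float:
    """Hypothetical stand-in for a trained sequence-to-activity model.

    Assumption for illustration only: "activity" is the number of
    occurrences of the motif GATA in the input sequence.
    """
    motif = "GATA"
    last_start = len(seq) - len(motif) + 1
    return float(sum(seq[i:i + len(motif)] == motif for i in range(last_start)))


def variant_effect(model: Callable[[str], float], ref_seq: str, pos: int, alt: str) -> float:
    """Score a single-nucleotide variant as model(alt) - model(ref).

    pos is a 0-based index into ref_seq; alt is the alternate base.
    """
    if not 0 <= pos < len(ref_seq):
        raise ValueError("pos is outside the reference sequence")
    if len(alt) != 1 or alt == ref_seq[pos]:
        raise ValueError("alt must be a single base that differs from the reference")
    # Build the alternate sequence by substituting one base.
    alt_seq = ref_seq[:pos] + alt + ref_seq[pos + 1:]
    return model(alt_seq) - model(ref_seq)


ref = "CCGATACC"
# T>C at index 4 turns GATA into GACA, removing the only motif match.
score = variant_effect(toy_model, ref, 4, "C")
```

A negative score here means the variant lowers the toy model's predicted activity. With a real model, the same reference-versus-alternate comparison would be made on the model's predicted molecular phenotype, such as a chromatin state or expression level.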
Taxonomy
Topics: Genetics, Bioinformatics, and Biomedical Research
