# Harnessing artificial intelligence for genomic variant prediction: advances, challenges, and future directions

**Authors:** Indah Pakpahan, Mentari Sihombing, Haohan Liu, Mengyao Wang, Zheng Su, Mingyan Fang

PMC · DOI: 10.1093/gigascience/giag004 · GigaScience · 2026-01-10

## TL;DR

This review surveys how artificial intelligence is improving the prediction of genetic variant effects, covering recent progress, open challenges, and strategies for more accurate variant interpretation in disease research and personalized medicine.

## Contribution

The review assesses current variant impact predictors and proposes strategies for predictor selection, multi-omics data integration, and optimized computational workflows, with an emphasis on explainable AI and more inclusive genomic databases.

## Key findings

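- Variant prediction has moved from traditional rule-based systems and statistical models to machine learning, deep learning, and protein language models.
- Explainable AI frameworks and more inclusive genomic databases are needed to address data heterogeneity, limited interpretability, and the persistence of variants of uncertain significance.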
- Optimized workflows and multi-omics integration can enhance clinical and research variant interpretation.

## Abstract

Accurate genetic variant interpretation is crucial for disease research and the development of targeted therapies. Artificial intelligence is transforming this field by integrating computational methodologies across structural biology, evolutionary analysis, and multimodal genomic data. This review examines the evolution from traditional rule-based systems and statistical models to contemporary machine learning, deep learning, and protein language models, while addressing critical challenges in variant classification. Key obstacles include data heterogeneity, interpretability, and the persistence of variants of uncertain significance, emphasizing the critical need for explainable artificial intelligence frameworks and more inclusive genomic databases to improve predictive accuracy across diverse populations. Based on the assessment of current variant impact predictors, we propose strategies for enhanced predictor selection, effective multi-omics data integration, and optimized computational workflows. These recommendations aim to enhance variant interpretation accuracy in both research settings and clinical practice, ultimately contributing to advances in personalized medicine.

## Full-text entities

- **Genes:** TP53 (tumor protein p53) [NCBI Gene 7157] {aka BCC7, BMFS5, LFS1, P53, TRP53}, PTEN (phosphatase and tensin homolog) [NCBI Gene 5728] {aka 10q23del, BZS, CWS1, DEC, GLM2, MHAM}, BRCA1 (BRCA1 DNA repair associated) [NCBI Gene 672] {aka BRCAI, BRCC1, BROVCA1, FANCS, IRIS, PNCA4}
- **Diseases:** hyperlipidemia (MESH:D006949), immunodeficiencies (MESH:D007153), hypertension (MESH:D006973), Cancer (MESH:D009369), type 2 diabetes (MESH:D003924), Hypertrophic Cardiomyopathy (MESH:D002312), Inherited Retinal Diseases (MESH:D012164), HCM (MESH:D000092183), Primary Immunodeficiency Diseases (MESH:D000081207)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12888390/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12888390/full.md

## References

141 references — full list in the complete paper: https://tomesphere.com/paper/PMC12888390/full.md

---
Source: https://tomesphere.com/paper/PMC12888390