A Transformers-based framework for refinement of genetic variants
Omar Abdelwahab, Davoud Torkamaneh

TL;DR
This paper introduces a deep learning framework using Transformers to improve the accuracy of genetic variant refinement in sequencing data.
Contribution
The paper contributes a generalizable Transformer-based framework for variant refinement that integrates with standard pipelines and outperforms heuristic filters.
Findings
The framework achieved 89.26% accuracy and a ROC AUC of 0.88 when trained on 2 million variants.
VariantTransformer, the proposed framework, improved baseline filtering accuracy by 4%–10% across tested samples.
It outperformed traditional filters and approached the accuracy of state-of-the-art AI-based callers like DeepVariant.
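For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how accuracy and ROC AUC can be computed from per-variant scores. The labels and scores are made-up toy values, not data from the paper, and the function names `accuracy` and `roc_auc` are illustrative rather than part of the authors' code.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of variants whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen true variant
    receives a higher score than a randomly chosen artifact (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = true variant, 0 = technical artifact.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.6, 0.8, 0.1]

toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
```

Accuracy depends on the chosen decision threshold, whereas ROC AUC summarizes ranking quality across all thresholds, which is why the two numbers are usually reported together.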
Abstract
Accurate variant calling refinement is crucial for distinguishing true genetic variants from technical artifacts in high-throughput sequencing data. While heuristic filtering and manual review are common approaches for refining variants, manual review is time-consuming, and heuristic filtering often lacks optimal solutions, especially for low-coverage data. Traditional variant calling methods often struggle with accuracy, especially in regions of low read coverage, leading to false-positive or false-negative calls. Advances in artificial intelligence, particularly deep learning, offer promising solutions for automating this refinement process. Here, we present a Transformers-based framework for genetic variant refinement that leverages self-attention to model dependencies among variant features and directly processes VCF files, enabling seamless integration with standard pipelines such…
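The idea of reading variant features from a VCF record and relating them through self-attention can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the example VCF line, the chosen features (QUAL, DP, MQ), the scaling by 100, the two-dimensional toy embedding, and the use of identity query/key/value projections. It is not the authors' architecture, which would use learned projections and a full Transformer encoder.

```python
import math


def parse_vcf_line(line):
    """Split one VCF data line into its fixed columns and INFO key/value pairs."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, qual, flt, info = fields[:8]
    info_dict = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        info_dict[key] = value if value else True  # flag entries carry no value
    return {
        "chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt,
        "qual": None if qual == "." else float(qual),
        "filter": flt, "info": info_dict,
    }


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.
    Assumption: identity Q/K/V projections (no learned weights)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * t[i] for w, t in zip(weights, tokens))
                    for i in range(d)])
    return out


# Hypothetical VCF record: CHROM POS ID REF ALT QUAL FILTER INFO
record = parse_vcf_line("chr1\t12345\t.\tA\tG\t50.0\tPASS\tDP=14;MQ=60;DB")

# One token per feature; each token is a toy embedding [scaled value, bias].
features = [record["qual"], float(record["info"]["DP"]), float(record["info"]["MQ"])]
tokens = [[f / 100.0, 1.0] for f in features]
attended = self_attention(tokens)
```

Each row of `attended` is a weighted mix of all feature tokens, so the representation of one feature (for example, depth) is informed by the others (for example, mapping quality). A trained model would learn which of these interactions matter for separating true variants from artifacts.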
Taxonomy
Topics: Genomics and Rare Diseases · Genomics and Phylogenetic Studies · Genetic Associations and Epidemiology
