Joint processing of long- and short-read sequencing data with deep learning improves variant calling
Gennaro Gambardella

TL;DR
Combining short- and long-read sequencing data with deep learning improves accuracy and cost-efficiency in detecting genetic variants.
Contribution
A hybrid DeepVariant model that jointly processes Illumina and Nanopore data improves germline variant detection accuracy.
Findings
Shallow hybrid sequencing matches or surpasses single-technology methods in variant detection accuracy.
Hybrid sequencing enables detection of large structural variations while reducing sequencing costs.
Joint modeling of hybrid inputs improves DeepVariant's performance compared to single-technology approaches.
Abstract
Despite the complementary strengths of short- and long-read sequencing approaches, variant-calling methods still rely on a single data type. In this study, we collected and harmonized Nanopore datasets for the seven healthy individuals in the Genome in a Bottle (GIAB) project from three independent consortia. Using these harmonized Nanopore data, we explore the benefits of a hybrid DeepVariant model that jointly processes Illumina and Nanopore data for germline variant detection. We show that a shallow hybrid long-short sequencing approach can match or surpass the germline variant detection accuracy of state-of-the-art single-technology methods, potentially reducing overall sequencing costs and enabling the detection of large germline structural variations. These findings hold great promise for molecular diagnostics in clinical settings, particularly for rare genetic disease screening.
•Hybrid…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Genomics and Rare Diseases · Protist diversity and phylogeny
