A Modular Open Source Framework for Genomic Variant Calling
Ankita Vaishnobi Bisoi, Shreyas V, Jose Siguenza, Bharath Ramsundar

TL;DR
This paper introduces a modular, open-source framework that integrates DeepVariant into DeepChem, enhancing genomic variant calling accuracy with a CNN-based pipeline suitable for bioinformatics and drug discovery applications.
Contribution
It presents a novel, extensible variant calling pipeline within DeepChem that leverages DeepVariant's CNN architecture for improved genetic variation detection.
Findings
Enhanced variant detection accuracy
Modular pipeline integrated into DeepChem
Supports realignment and candidate detection stages
Abstract
Variant calling is a fundamental task in genomic research, essential for detecting genetic variations such as single nucleotide polymorphisms (SNPs) and insertions or deletions (indels). This paper presents an enhancement to DeepChem, a widely used open-source drug discovery framework, through the integration of DeepVariant. In particular, we introduce a variant calling pipeline that leverages DeepVariant's convolutional neural network (CNN) architecture to improve the accuracy and reliability of variant detection. The pipeline comprises four stages: realignment of sequencing reads, candidate variant detection, pileup image generation, and variant classification using a modified Inception v3 model. Our work adds a modular and extensible variant calling pipeline to DeepChem and enables future work integrating DeepChem's drug discovery infrastructure…
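To make the candidate detection and pileup stages named in the abstract more concrete, here is a minimal, self-contained sketch in plain Python. It is not the DeepChem or DeepVariant API. The function names (`count_alleles`, `find_candidates`, `pileup_rows`), the toy reference and reads, the thresholds, and the base-to-intensity encoding are all assumptions made for illustration. It also assumes gapless alignments, so it only finds SNP candidates, and it skips realignment and classification entirely.

```python
from collections import Counter

# Hypothetical toy data: each read is (start_position, sequence), already
# aligned to the reference without gaps, so only SNPs can be detected here.
REFERENCE = "ACGTACGTAC"
READS = [
    (0, "ACGTAC"),
    (2, "GTTCGT"),
    (3, "TTCGTA"),
    (4, "TCGTAC"),
]

# Illustrative encoding of bases as intensities (an assumption, not the
# encoding used by any particular tool). Uncovered positions map to 0.0.
BASE_CHANNEL = {"A": 0.25, "C": 0.5, "G": 0.75, "T": 1.0}


def count_alleles(reference, reads):
    """Count the bases observed at every reference position."""
    counts = [Counter() for _ in reference]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1
    return counts


def find_candidates(reference, reads, min_count=2, min_fraction=0.12):
    """Return (pos, ref_base, alt_base, alt_count, depth) for each
    non-reference base with enough supporting reads. The thresholds
    are illustrative defaults chosen for this sketch."""
    candidates = []
    for pos, counter in enumerate(count_alleles(reference, reads)):
        depth = sum(counter.values())
        for base, n in sorted(counter.items()):
            if base != reference[pos] and n >= min_count and n / depth >= min_fraction:
                candidates.append((pos, reference[pos], base, n, depth))
    return candidates


def pileup_rows(reference, reads, center, width=5):
    """Build a small pileup matrix around `center`: the first row encodes
    the reference, and each following row encodes one read."""
    half = width // 2
    window = range(center - half, center + half + 1)
    rows = [[
        BASE_CHANNEL.get(reference[p], 0.0) if 0 <= p < len(reference) else 0.0
        for p in window
    ]]
    for start, seq in reads:
        row = []
        for p in window:
            offset = p - start
            base = seq[offset] if 0 <= offset < len(seq) else None
            row.append(BASE_CHANNEL.get(base, 0.0))
        rows.append(row)
    return rows


candidates = find_candidates(REFERENCE, READS)
print(candidates)  # [(4, 'A', 'T', 3, 4)]
for row in pileup_rows(REFERENCE, READS, center=candidates[0][0]):
    print(row)
```

In a full pipeline, a matrix like this (with additional channels such as base quality and strand) would be passed to the CNN classifier. This sketch uses a single channel to stay short.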
