Scalable Pathogen Detection from Next Generation DNA Sequencing with Deep Learning
Sai Narayanan, Sathyanarayanan N. Aakur, Priyadharsini Ramamurthy, Arunkumar Bagavathi, Vishalini Ramnath, Akhilesh Ramachandran

TL;DR
This paper introduces MG2Vec, a transformer-based deep learning method for scalable, robust pathogen detection from raw metagenome sequences. It enables effective analysis of complex, real-world clinical data with minimal supervision.
Contribution
The work presents a novel transformer-based framework for learning robust representations from metagenome sequences, improving pathogen detection and generalization across diseases and species.
Findings
Effective detection of pathogens from uncurated clinical samples.
Learned representations generalize to unrelated pathogens.
Minimal human supervision required for accurate diagnostics.
Abstract
Next-generation sequencing technologies have extended the scope of the Internet of Things (IoT) to include genomics for personalized medicine by making abundant genome data, collected from heterogeneous sources, available at reduced cost. Given the sheer magnitude of the collected data and the significant challenges posed by highly similar genomic structure across species, there is a need for robust, scalable analysis platforms to extract actionable knowledge, such as the presence of potentially zoonotic pathogens. The emergence of zoonotic diseases from novel pathogens that can jump species barriers and lead to pandemics, such as the influenza virus in 1918 and SARS-CoV-2 in 2019, underscores the need for scalable metagenome analysis. In this work, we propose MG2Vec, a deep learning-based solution that uses the transformer network as its backbone, to…
Taxonomy
Topics: Genomics and Phylogenetic Studies · Microbial infections and disease research · Bacteriophages and microbial interactions
