Identifying viruses from metagenomic data by deep learning
Jie Ren, Kai Song, Chao Deng, Nathan A. Ahlgren, Jed A. Fuhrman, Yi Li, Xiaohui Xie, Fengzhu Sun

TL;DR
DeepVirFinder is a deep learning-based, reference-free tool that significantly improves the accuracy of viral sequence identification in metagenomic data, aiding virus discovery and potential disease diagnosis.
Contribution
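The paper introduces DeepVirFinder, a reference-free and alignment-free deep learning method that outperforms the previous state-of-the-art tool, VirFinder, at identifying viral sequences in metagenomic data, including sequences from previously unknown viruses.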
Findings
DeepVirFinder outperforms VirFinder across all contig lengths.
Adding environmental viral sequences enhances prediction accuracy.
Application to human gut samples identified virus bins linked to colorectal cancer.
Abstract
The recent development of metagenomic sequencing makes it possible to sequence microbial genomes including viruses in an environmental sample. Identifying viral sequences from metagenomic data is critical for downstream virus analyses. The existing reference-based and gene homology-based methods are not efficient in identifying unknown viruses or short viral sequences. Here we have developed a reference-free and alignment-free machine learning method, DeepVirFinder, for predicting viral sequences in metagenomic data using deep learning techniques. DeepVirFinder was trained based on a large number of viral sequences discovered before May 2015. Evaluated on the sequences after that date, DeepVirFinder outperformed the state-of-the-art method VirFinder at all contig lengths. Enlarging the training data by adding millions of purified viral sequences from environmental metavirome samples…
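The abstract describes a time-based evaluation: train on sequences discovered before May 2015, then test on sequences discovered afterwards. A minimal sketch of such a split is shown below. It is illustrative only and is not taken from the DeepVirFinder code; the `SequenceRecord` type, its field names, the example records, and the exact cutoff day (1 May 2015) are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SequenceRecord:
    """Hypothetical record type; field names are assumptions for illustration."""
    seq_id: str
    discovered: date
    is_viral: bool


# Assumed cutoff: the abstract says "before May 2015" without giving a day.
CUTOFF = date(2015, 5, 1)


def temporal_split(records, cutoff=CUTOFF):
    """Split records into (train, test) by discovery date.

    Records discovered strictly before the cutoff go to training;
    records discovered on or after the cutoff go to testing.
    """
    train = [r for r in records if r.discovered < cutoff]
    test = [r for r in records if r.discovered >= cutoff]
    return train, test


# Made-up example records, not real data.
records = [
    SequenceRecord("contig_a", date(2014, 11, 3), True),
    SequenceRecord("contig_b", date(2015, 4, 30), False),
    SequenceRecord("contig_c", date(2015, 5, 1), True),
    SequenceRecord("contig_d", date(2016, 2, 17), False),
]

train, test = temporal_split(records)
print(len(train), "training records;", len(test), "test records")
```

Splitting by discovery date rather than at random keeps later-discovered sequences out of training, so the test set approximates the task of recognizing viruses that were unknown when the model was built.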
Taxonomy
Topics: Bacteriophages and microbial interactions · Genomics and Phylogenetic Studies · RNA modifications and cancer
