TL;DR
This paper introduces PhaGCN, a semi-supervised graph convolutional network model that improves taxonomic classification of bacteriophage contigs from metagenomic data by integrating DNA features and gene-sharing information.
Contribution
The paper presents a novel semi-supervised GCN-based approach for phage classification that outperforms existing tools and effectively utilizes both labeled and unlabeled data.
Findings
PhaGCN achieves higher accuracy than existing tools.
The model effectively integrates DNA features and gene-sharing networks.
It performs well on both simulated and real sequencing data.
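To make the semi-supervised idea concrete, here is a minimal sketch of a generic two-layer graph convolutional network in NumPy. It is not PhaGCN's actual architecture or code. It only illustrates the standard GCN propagation rule on a graph whose nodes stand in for contigs, and a loss computed on the labeled nodes only, so that unlabeled nodes still contribute through the graph structure. The toy graph, the feature matrix, the layer sizes, and all function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 used by standard GCNs."""
    a_hat = adj + np.eye(adj.shape[0])        # add self-loops
    deg = a_hat.sum(axis=1)                   # node degrees (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_norm, x, w1, w2):
    """Two graph-convolution layers followed by a row-wise softmax."""
    h = np.maximum(a_norm @ x @ w1, 0.0)      # layer 1 + ReLU
    logits = a_norm @ h @ w2                  # layer 2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)

def masked_cross_entropy(probs, labels, labeled_mask):
    """Cross-entropy over labeled nodes only (the semi-supervised part)."""
    picked = probs[labeled_mask, labels[labeled_mask]]
    return float(-np.log(picked + 1e-12).mean())

# Hypothetical toy graph: 4 contigs, edges stand in for gene-sharing links.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))                   # stand-in for per-contig DNA features
w1 = rng.normal(scale=0.1, size=(3, 5))       # untrained, randomly initialized weights
w2 = rng.normal(scale=0.1, size=(5, 2))

labels = np.array([0, -1, -1, 1])             # -1 marks an unlabeled contig
labeled_mask = labels >= 0

a_norm = normalize_adjacency(adj)
probs = gcn_forward(a_norm, x, w1, w2)
loss = masked_cross_entropy(probs, labels, labeled_mask)
```

In this sketch the two unlabeled nodes receive class probabilities even though they never enter the loss, because each layer mixes every node's representation with those of its neighbors. Training the weights (for example by gradient descent on `loss`) is omitted.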
Abstract
Motivation: Bacteriophages (aka phages), which mainly infect bacteria, play key roles in the biology of microbes. As the most abundant biological entities on the planet, the number of discovered phages is only the tip of the iceberg. Recently, many new phages have been revealed using high throughput sequencing, particularly metagenomic sequencing. Compared to the fast accumulation of phage-like sequences, there is a serious lag in taxonomic classification of phages. High diversity, abundance, and limited known phages pose great challenges for taxonomic analysis. In particular, alignment-based tools have difficulty in classifying fast accumulating contigs assembled from metagenomic data.
Results: In this work, we present a novel semi-supervised learning model, named PhaGCN, to conduct taxonomic classification for phage contigs. In this learning model, we construct a knowledge graph by…
