Machine learning models for delineating marine microbial taxa
Stilianos Louca

TL;DR
This paper trains machine learning classifiers on genome similarity metrics to decide whether two marine microbial genomes belong to the same taxon, from genus up to phylum, which supports the identification of novel taxa in metagenomes.
Contribution
The study introduces machine learning models that accurately delineate marine microbial taxa using genome similarity metrics.
Findings
Machine learning classifiers achieved over 92% balanced accuracy in delineating marine microbial taxa.
Gene categories related to cofactor and vitamin metabolism are strongly correlated with taxon divergence.
Over half of marine prokaryotic phyla, classes, and orders have been identified through metagenomic surveys.
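For readers unfamiliar with the metric in the first finding, here is a minimal sketch of balanced accuracy as it is conventionally defined for a binary classifier: the mean of the true positive rate and the true negative rate. The labels below are made up for illustration (1 = the two genomes share a taxon, 0 = they do not) and are not data from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of true positive rate and true negative rate for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    pos = sum(1 for t in y_true if t)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    return 0.5 * (tp / pos + tn / neg)


# Hypothetical genome pairs: most pairs fall in different taxa, so the classes are imbalanced.
y_true = [1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1]

score = balanced_accuracy(y_true, y_pred)
plain = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
print(f"balanced accuracy = {score:.3f}, plain accuracy = {plain:.3f}")
```

Because same-taxon pairs are much rarer than different-taxon pairs, plain accuracy can look good even when most same-taxon pairs are missed. Balanced accuracy weights both classes equally.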
Abstract
The relationship between gene content differences and microbial taxonomic divergence remains poorly understood, and algorithms for delineating novel microbial taxa above genus level based on multiple genome similarity metrics are lacking. Addressing these gaps is important for macroevolutionary theory, biodiversity assessments, and discovery of novel taxa in metagenomes. Here, I develop machine learning classifier models, based on multiple genome similarity metrics, to determine whether any two marine bacterial and archaeal (prokaryotic) metagenome-assembled genomes (MAGs) belong to the same taxon, from the genus up to the phylum levels. Metrics include average amino acid and nucleotide identities, and fractions of shared genes within various categories, applied to 14 390 previously published non-redundant MAGs. At all taxonomic levels, the balanced accuracy (average of the…
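As an illustration of the kind of input such a classifier could take, the sketch below builds a feature vector for one pair of MAGs from an average amino acid identity (AAI), an average nucleotide identity (ANI), and a per-category fraction of shared genes. Everything here is an assumption for illustration: the function names, the gene-family identifiers, the category names, the identity values, and in particular the choice of a Jaccard-style fraction (shared genes divided by the union), which may differ from the definition used in the paper.

```python
def shared_gene_fractions(genes_a, genes_b):
    """Per-category fraction of shared gene families (Jaccard index, an assumed definition)."""
    fractions = {}
    for category in sorted(set(genes_a) | set(genes_b)):
        a = genes_a.get(category, set())
        b = genes_b.get(category, set())
        union = a | b
        # An empty category in both genomes carries no signal; report 0.0.
        fractions[category] = len(a & b) / len(union) if union else 0.0
    return fractions


def pair_features(aai, ani, genes_a, genes_b):
    """Combine identity metrics and shared-gene fractions into one feature dict."""
    features = {"AAI": aai, "ANI": ani}
    for category, fraction in shared_gene_fractions(genes_a, genes_b).items():
        features["shared_" + category] = fraction
    return features


# Hypothetical MAGs: category name -> set of gene-family identifiers.
mag_a = {"cofactor_vitamin": {"K1", "K2", "K3"}, "transport": {"K10", "K11"}}
mag_b = {"cofactor_vitamin": {"K2", "K3", "K4"}, "transport": {"K10"}}

# AAI and ANI would come from an external tool; these values are placeholders.
features = pair_features(0.62, 0.71, mag_a, mag_b)
print(features)
```

One such feature vector per genome pair, together with a same-taxon or different-taxon label at a given rank, would form one training example for a binary classifier.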
Taxonomy
Topics: Genomics and Phylogenetic Studies · Microbial Community Ecology and Physiology · Oral microbiology and periodontitis research
