AFITbin: a metagenomic contig binning method using aggregate l-mer frequency based on initial and terminal nucleotides
Amin Darabi, Sayeh Sobhani, Rosa Aghdam, Changiz Eslahchi

TL;DR
AFITBin is a new method for grouping metagenomic contigs using a novel l-mer frequency approach, improving the accuracy of microbial community analysis.
Contribution
AFITBin introduces a new l-mer statistic vector and matrix factorization method for improved metagenomic binning.
Findings
AFITBin outperforms existing methods in taxonomic identification of metagenomic contigs.
The AFIT vector provides better clustering of species compared to traditional TNF methods.
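The text on this page does not define the AFIT vector. As an illustration only, here is a minimal sketch of the general idea suggested by the title: aggregating l-mer counts by their initial and terminal nucleotides. The function name `first_last_profile`, the normalization, and the handling of ambiguous bases are all assumptions, and the paper's actual definition may differ.

```python
from collections import Counter


def first_last_profile(contig: str, l: int) -> dict[str, float]:
    """Hypothetical illustration, NOT the paper's AFIT definition.

    Groups every length-l window of the contig by its (first, last)
    nucleotide pair and returns the relative frequency of each of the
    16 possible pairs.
    """
    seq = contig.upper()
    pairs = [a + b for a in "ACGT" for b in "ACGT"]
    counts = Counter()
    for i in range(len(seq) - l + 1):
        window = seq[i:i + l]
        # Assumption: skip windows containing N or other ambiguous symbols.
        if set(window) <= set("ACGT"):
            counts[window[0] + window[-1]] += 1
    total = sum(counts.values())
    return {p: (counts[p] / total if total else 0.0) for p in pairs}
```

Under this reading, a contig yields 16 values for any l, whereas a full l-mer count vector has 4^l entries.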
Abstract
Using next-generation sequencing technologies, scientists can sequence complex microbial communities directly from the environment. Significant insights into the structure, diversity, and ecology of microbial communities have resulted from the study of metagenomics. The assembly of reads into longer contigs, which are then binned into groups of contigs that correspond to different species in the metagenomic sample, is a crucial step in the analysis of metagenomics. It is necessary to organize these contigs into operational taxonomic units (OTUs) for further taxonomic profiling and functional analysis. For binning, which is synonymous with the clustering of OTUs, the tetra-nucleotide frequency (TNF) is typically utilized as a compositional feature for each OTU. In this paper, we present AFIT, a new l-mer statistic vector for each contig, and AFITBin, a novel method for metagenomic…
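For readers unfamiliar with the TNF feature mentioned in the abstract, a minimal sketch of a tetra-nucleotide frequency vector follows. The function name `tnf_vector` is hypothetical, and the sketch does not merge reverse-complement 4-mers, which some binning tools do. It is a generic illustration, not code from the paper.

```python
from collections import Counter
from itertools import product


def tnf_vector(contig: str, k: int = 4) -> list[float]:
    """Return normalized k-mer frequencies in lexicographic order.

    With k=4 this is the 256-dimensional tetra-nucleotide frequency vector.
    """
    seq = contig.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Assumption: windows containing N or other symbols are ignored.
    total = sum(counts[m] for m in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

Each contig then becomes a fixed-length numeric vector, so contigs of different lengths can be compared and clustered with standard methods.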
Taxonomy
Topics: Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics · Gene expression and cancer classification
