CONCOCT: Clustering cONtigs on COverage and ComposiTion
Johannes Alneberg, Brynjar Smari Bjarnason, Ino de Bruijn, Melanie Schirmer, Joshua Quick, Umer Z. Ijaz, Nicholas J. Loman, Anders F. Andersson, Christopher Quince

TL;DR
CONCOCT is a computational tool that integrates sequence composition, coverage, and linkage data to accurately bin contigs into genomes from metagenomic data, aiding microbial community analysis.
Contribution
It introduces a novel method combining multiple data types for improved contig binning in metagenomics, demonstrating high accuracy on real and artificial datasets.
Findings
High recall and precision in binning results
Effective on both artificial and real datasets
Improves genome reconstruction in metagenomics
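To make the recall and precision finding concrete, here is a minimal sketch of one common way to score a binning against known genome labels. The summary above does not say how the two metrics were computed, so the purity-style definitions and the function name binning_precision_recall are assumptions for illustration only.

```python
from collections import Counter, defaultdict


def binning_precision_recall(true_genomes, predicted_bins):
    """Score a binning against reference labels (hypothetical helper).

    Assumed definitions, not taken from the summary:
      precision = fraction of contigs that belong to the majority genome
                  of the bin they were placed in
      recall    = fraction of contigs that sit in the bin holding most of
                  their genome
    """
    if len(true_genomes) != len(predicted_bins):
        raise ValueError("label lists must have the same length")
    if not true_genomes:
        raise ValueError("at least one contig is required")

    n = len(true_genomes)
    by_bin = defaultdict(Counter)     # bin -> genome -> contig count
    by_genome = defaultdict(Counter)  # genome -> bin -> contig count
    for genome, bin_id in zip(true_genomes, predicted_bins):
        by_bin[bin_id][genome] += 1
        by_genome[genome][bin_id] += 1

    precision = sum(max(c.values()) for c in by_bin.values()) / n
    recall = sum(max(c.values()) for c in by_genome.values()) / n
    return precision, recall


if __name__ == "__main__":
    truth = ["A", "A", "A", "B", "B", "B"]
    bins = [1, 1, 2, 2, 3, 3]
    print(binning_precision_recall(truth, bins))
```

Counting contigs equally is a simplification; weighting each contig by its length is a plausible alternative when contig sizes vary widely.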
Abstract
Metagenomics enables the reconstruction of microbial genomes in complex microbial communities without the need for culturing. Since assembly typically results in fragmented genomes, the grouping of genome fragments (contigs) belonging to the same genome, a process referred to as binning, remains a major informatics challenge. Here we present CONCOCT, a computer program that combines three types of information - sequence composition, coverage across multiple samples, and read-pair linkage - to automatically bin contigs into genomes. We demonstrate high recall and precision rates of the program on artificial as well as real human gut metagenome datasets.
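As a rough illustration of how composition and coverage can be combined for one contig, the sketch below builds a single feature vector from k-mer frequencies and per-sample coverage values. The abstract does not describe the actual representation, so the choice of k, the pseudocounts, the log transform, and the names kmer_profile and feature_vector are all assumptions; read-pair linkage and the clustering step itself are not shown.

```python
from itertools import product
from math import log


def kmer_profile(seq, k=4):
    """Return normalized k-mer frequencies for one contig (hypothetical helper)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = {kmer: 1 for kmer in kmers}  # pseudocount of 1 avoids log(0) later
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
    total = sum(counts.values())
    return [counts[kmer] / total for kmer in kmers]


def feature_vector(seq, coverages, k=4, pseudo=1e-3):
    """Concatenate log coverage per sample with log k-mer frequencies.

    `coverages` holds one mean read depth per sample for this contig.
    The pseudocount `pseudo` is an assumed value that keeps log() defined
    when a contig has zero coverage in some sample.
    """
    cov = [log(c + pseudo) for c in coverages]
    comp = [log(f) for f in kmer_profile(seq, k)]
    return cov + comp


if __name__ == "__main__":
    vec = feature_vector("ACGTACGTTTGACCA", [12.0, 0.0, 3.5])
    print(len(vec))  # 3 coverage entries plus 4**4 composition entries
```

Vectors built this way for every contig could then be passed to any clustering method; contigs from the same genome are expected to share both a composition signature and a coverage pattern across samples.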
Taxonomy
Topics: Genomics and Phylogenetic Studies · Gut microbiota and health · Probiotics and Fermented Foods
