# MetaComBin: combining abundances and overlaps for binning metagenomics reads

**Authors:** Francesco Tomasella, Cinzia Pizzi

PMC · DOI: 10.3389/fbinf.2025.1504728 · 2025-03-03

## TL;DR

This paper introduces MetaComBin, a read-level binning framework for metagenomics data that sequentially combines an abundance-based approach with an overlap-based one to group reads by species more accurately.

## Contribution

The contribution is a computational framework that sequentially combines two complementary read-binning approaches, one abundance-based and one overlap-based, to improve binning quality at the read level.

## Key findings

- Combining abundance-based and overlap-based methods improves clustering quality in metagenomics.
- MetaComBin can improve clustering quality even when the number of species is not known beforehand.
- The approach is effective in realistic metagenomics scenarios.

## Abstract

Metagenomics is the discipline that studies heterogeneous microbial samples extracted directly from their natural environment, for example, from soil, water, or the human body. The detection and quantification of the species that populate microbial communities have been the subject of many recent studies based on classification and clustering, because these tasks are the first step in more complex pipelines (e.g., for functional analysis, de novo assembly, or comparison of metagenomes). Metagenomics has an impact on both environmental studies and precision medicine; thus, it is crucial to improve the quality of species identification through computational tools.

In this paper, we explore the idea of improving the overall quality of metagenomics binning at the read level by proposing a computational framework that sequentially combines two complementary read-binning approaches: one based on species abundance determination and the other relying on read overlap to cluster reads together. We call this approach MetaComBin (metagenomics combined binning).
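To make the idea of sequentially combining two binning steps concrete, here is a minimal Python sketch. It is not the authors' implementation. It rests on several assumptions that are not in the abstract: the abundance step runs first, abundance is approximated by the mean k-mer count of a read, and overlap is approximated by the number of shared k-mers. The function names and the parameters `K`, `threshold`, and `min_shared` are all hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean

K = 4  # hypothetical k-mer length, chosen small for the toy example


def kmers(read, k=K):
    """Return all k-mers of a read (reads are assumed to be at least k long)."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def abundance_bins(reads, threshold, k=K):
    """Stage 1 (toy): label each read 'high' or 'low' by its mean k-mer count."""
    counts = Counter(km for r in reads for km in kmers(r, k))
    bins = defaultdict(list)
    for idx, read in enumerate(reads):
        score = mean([counts[km] for km in kmers(read, k)])
        bins["high" if score >= threshold else "low"].append(idx)
    return dict(bins)


def overlap_clusters(reads, indices, min_shared, k=K):
    """Stage 2 (toy): union-find over reads sharing at least min_shared k-mers."""
    parent = {i: i for i in indices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    kmer_sets = {i: set(kmers(reads[i], k)) for i in indices}
    for pos, a in enumerate(indices):
        for b in indices[pos + 1:]:
            if len(kmer_sets[a] & kmer_sets[b]) >= min_shared:
                parent[find(a)] = find(b)

    groups = defaultdict(list)
    for i in indices:
        groups[find(i)].append(i)
    return sorted(groups.values())


def combined_binning(reads, threshold=2.0, min_shared=2):
    """Run the abundance step, then refine each abundance bin by overlap."""
    final = []
    for _, idxs in sorted(abundance_bins(reads, threshold).items()):
        final.extend(overlap_clusters(reads, idxs, min_shared))
    return final


reads = ["ACGTACGTAC", "CGTACGTACG", "GTACGTACGT", "TTTTGGGGCC", "AAACCCTTTG"]
clusters = combined_binning(reads)
print(clusters)
```

In this toy input the first three reads come from the same repetitive sequence, so they land in the high-abundance bin and are then merged by overlap. The last two reads fall in the low-abundance bin and share only one k-mer, so the overlap step keeps them apart. The second stage needs no target number of clusters, which is in the spirit of the setting the paper describes, where the number of species is not known beforehand.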

Our experiments with MetaComBin showed that combining two tools based on different approaches can improve clustering quality in realistic conditions, where the number of species is not known beforehand.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC11912761/full.md

---
Source: https://tomesphere.com/paper/PMC11912761