# Benchmarking the Taxonomic Resolution of Fish eDNA Metabarcodes Against COI Barcodes

**Authors:** Eliot Ruiz, Thomas Lamy, David Mouillot, Jean‐Dominique Durand

PMC · DOI: 10.1111/1755-0998.70069 · 2025-10-30

## TL;DR

This study benchmarks 22 fish eDNA metabarcodes, all derived from the same Actinopterygian mitogenomes, against COI Barcode Index Numbers (BINs) to guide the choice of metabarcode and clustering threshold in biodiversity monitoring.

## Contribution

The paper introduces a framework that evaluates the taxonomic resolution of metabarcodes and clustering thresholds against a COI BIN baseline, quantifying over-splitting and over-merging errors separately.

## Key findings

- The clustering threshold is the most important factor influencing biodiversity estimates, whatever the clustering method.
- Taxonomic resolution varies significantly with gene, taxonomic order, and community diversity, but is independent of metabarcode length.
- The authors propose an optimal threshold for each metabarcode to minimize over-merging and over-splitting errors.
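
To illustrate why the threshold matters so much, the sketch below sweeps a few dissimilarity thresholds over a toy distance table and keeps the one with the lowest summed error against reference BINs. It is a minimal illustration, not the authors' R implementation: single-linkage clustering stands in for the de novo methods compared in the paper, and all names (`single_linkage`, `error_sum`, `best_threshold`), sequence IDs, and distances are hypothetical.

```python
from itertools import combinations


def single_linkage(ids, dist, threshold):
    """Group ids whose pairwise distance is <= threshold (union-find).

    Assumption: single linkage is a stand-in, not the paper's clustering method.
    """
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(ids, 2):
        if dist[frozenset((a, b))] <= threshold:
            parent[find(a)] = find(b)
    return {i: find(i) for i in ids}


def error_sum(bin_of, otu_of):
    """Number of BINs split across OTUs plus number of OTUs merging BINs."""
    otus_per_bin, bins_per_otu = {}, {}
    for seq, b in bin_of.items():
        otus_per_bin.setdefault(b, set()).add(otu_of[seq])
        bins_per_otu.setdefault(otu_of[seq], set()).add(b)
    split = sum(len(v) > 1 for v in otus_per_bin.values())
    merged = sum(len(v) > 1 for v in bins_per_otu.values())
    return split + merged


def best_threshold(ids, dist, bin_of, thresholds):
    """Return the threshold with the lowest error sum, plus all scores."""
    scores = {t: error_sum(bin_of, single_linkage(ids, dist, t)) for t in thresholds}
    return min(scores, key=scores.get), scores


# Toy data: two BINs, small within-BIN and larger between-BIN distances.
ids = ["s1", "s2", "s3", "s4"]
bin_of = {"s1": "BIN_A", "s2": "BIN_A", "s3": "BIN_B", "s4": "BIN_B"}
pairs = [
    (("s1", "s2"), 0.01), (("s3", "s4"), 0.02),
    (("s1", "s3"), 0.05), (("s1", "s4"), 0.06),
    (("s2", "s3"), 0.07), (("s2", "s4"), 0.08),
]
dist = {frozenset(p): d for p, d in pairs}

best, scores = best_threshold(ids, dist, bin_of, [0.005, 0.03, 0.10])
print(best, scores)
```

On this toy input the strictest threshold splits both BINs, the loosest merges them into one OTU, and only the intermediate value recovers the BIN baseline, which is the trade-off an optimal threshold is meant to balance.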

## Abstract

Even though environmental DNA metabarcoding is revolutionizing biomonitoring, many critical steps remain unstandardized, leading to arbitrary choices, particularly regarding the selection of metabarcode, clustering method and similarity threshold, among others. Additionally, these studies were hindered by biases resulting from the presence of mislabeled sequences in international databases such as GenBank and the lack of explicit definitions for taxonomic resolution. To address these issues, we developed a robust framework to compare the performance of 22 metabarcodes derived from the same mitogenomes (all available for Actinopterygians in NCBI) against a standardized taxonomic baseline based on COI Barcode Index Numbers (BINs). This framework allows for the separate quantification of over‐splitting (splitting the same taxon/BIN) and over‐merging (merging different taxa/BINs). Comparison of OTUs obtained with multiple de novo clustering methods to BINs confirmed the metabarcode ranking based on error sums. Although each metabarcode exhibited varying sensitivities to over‐merging or over‐splitting errors, the clustering threshold emerged as the most important factor influencing biodiversity estimates whatever the clustering method. This led us to propose optimal thresholds for each metabarcode to delineate taxonomic levels (metabarcode gaps). Additionally, we found that taxonomic resolution varied significantly among genes, orders and community diversity, but independently of metabarcode length. Overall, the choice of metabarcode and clustering threshold should aim to minimize over‐merging or over‐splitting while ensuring accurate lower taxonomic delineations. A set of documented R functions makes this evaluation of taxonomic resolution easily applicable to any other taxonomic group for which a representative set of full genes or mitogenomes is available.
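
The two error types defined in the abstract can be made concrete with a small sketch. Given a reference BIN and an OTU assignment for each sequence, it counts BINs spread over several OTUs (over-splitting) and OTUs containing several BINs (over-merging). This is one plausible reading of the definitions, not the paper's exact metric or its R functions; the function name `count_resolution_errors` and the toy IDs are hypothetical.

```python
from collections import defaultdict


def count_resolution_errors(bin_of, otu_of):
    """Count over-split BINs and over-merged OTUs.

    bin_of and otu_of map each sequence ID to its reference BIN and its OTU.
    Assumption: an error is counted once per affected BIN or OTU.
    """
    otus_per_bin = defaultdict(set)
    bins_per_otu = defaultdict(set)
    for seq_id, bin_id in bin_of.items():
        otu_id = otu_of[seq_id]
        otus_per_bin[bin_id].add(otu_id)
        bins_per_otu[otu_id].add(bin_id)
    # Over-splitting: one BIN ends up in more than one OTU.
    over_split = sum(1 for otus in otus_per_bin.values() if len(otus) > 1)
    # Over-merging: one OTU contains more than one BIN.
    over_merged = sum(1 for bins in bins_per_otu.values() if len(bins) > 1)
    return over_split, over_merged


# Toy example: BIN_A is split over otu1/otu2, and otu2 merges BIN_A with BIN_B.
bin_of = {"s1": "BIN_A", "s2": "BIN_A", "s3": "BIN_B", "s4": "BIN_C", "s5": "BIN_C"}
otu_of = {"s1": "otu1", "s2": "otu2", "s3": "otu2", "s4": "otu3", "s5": "otu3"}

over_split, over_merged = count_resolution_errors(bin_of, otu_of)
print(over_split, over_merged)  # 1 1
```

Keeping the two counts separate, rather than reporting only their sum, shows whether a given metabarcode or threshold errs toward splitting or toward merging.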

## Linked entities

- **Genes:** COX1 (cytochrome c oxidase subunit I) [NCBI Gene 4512]

## Full-text entities

- **Genes:** COX1 (cytochrome c oxidase subunit I) [NCBI Gene 4512] {aka COI, MTCO1}

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12627918/full.md

---
Source: https://tomesphere.com/paper/PMC12627918