# Optimization of Mapping Tools and Investigation of Ribosomal RNA Influence for Data-Driven Gene Expression Analysis in Complex Microbiomes

**Authors:** Ryo Mameda, Hidemasa Bono

PMC · DOI: 10.3390/microorganisms13050995 · 2025-04-26

## TL;DR

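This study compares BWA-MEM and Bowtie2 for mapping metagenomic and metatranscriptomic reads to metagenomic contigs, and shows that rRNA-mapped reads should be excluded before TPM calculation to avoid overestimating expression changes in complex microbiomes.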

## Contribution

The study identifies BWA-MEM as the more efficient mapper when metagenomic contigs serve as the reference, and shows that metatranscriptomic reads mapped to rRNA need to be excluded for accurate TPM calculations in microbiome analysis.

## Key findings

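- BWA-MEM mapped both metagenomic and metatranscriptomic reads to metagenomic contigs more efficiently than Bowtie2, even after Bowtie2 parameters were optimized.
- rRNA sequences contaminate predicted protein-coding regions in metagenomic contigs, which leads to overestimated TPM changes when metatranscriptomic samples are compared.
- The overestimation grows with the difference in rRNA content between samples, so reads mapped to rRNA should be excluded before TPM calculation.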
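The rRNA effect on TPM can be made explicit with the standard TPM definition. The derivation below is an illustrative sketch, not taken from the paper: the symbols $c_i$ (reads mapped to feature $i$), $\ell_i$ (feature length), $R$ (the set of rRNA-derived features), and $f_R$ (the rRNA share of the length-normalized signal) are assumptions introduced here.

$$
\mathrm{TPM}_i = \frac{c_i/\ell_i}{\sum_j c_j/\ell_j}\times 10^{6},
\qquad
f_R = \frac{\sum_{j\in R} c_j/\ell_j}{\sum_j c_j/\ell_j}
$$

For a non-rRNA feature $i$, the TPM computed with rRNA features included relates to the TPM computed with them excluded by

$$
\mathrm{TPM}_i^{\text{incl}} = (1 - f_R)\,\mathrm{TPM}_i^{\text{excl}},
$$

so the ratio between two samples $A$ and $B$ becomes

$$
\frac{\mathrm{TPM}_{i,B}^{\text{incl}}}{\mathrm{TPM}_{i,A}^{\text{incl}}}
= \frac{1 - f_R^{B}}{1 - f_R^{A}}\cdot
\frac{\mathrm{TPM}_{i,B}^{\text{excl}}}{\mathrm{TPM}_{i,A}^{\text{excl}}}.
$$

Under these assumptions the extra factor equals 1 only when both samples have the same rRNA share, and it moves further from 1 as the rRNA shares diverge.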

## Abstract

For gene expression analysis in complex microbiomes, utilizing both metagenomic and metatranscriptomic reads from the same sample enables advanced functional analysis. Due to their diversity, metagenomic contigs are often used as reference sequences instead of complete genomes. However, studies optimizing mapping strategies for both read types remain limited. In addition, although transcripts per million (TPM) is commonly used for normalization, few studies have evaluated the influence of ribosomal RNA (rRNA) in metatranscriptomic reads. This study compared Burrows–Wheeler Aligner–Maximal Exact Match (BWA-MEM) and Bowtie2 as mapping tools for metagenomic contigs. Even after optimizing Bowtie2 parameters, BWA-MEM showed higher efficiency in mapping both metagenomic and metatranscriptomic reads. Further analysis revealed that rRNA sequences contaminate predicted protein-coding regions in metagenomic contigs. When comparing TPM values across samples, contamination by rRNA led to an overestimation of TPM changes. This effect was more pronounced when the difference in rRNA content between samples was larger. These findings suggest that metatranscriptomic reads mapped to rRNA should be excluded before TPM calculations. This study highlights key factors influencing read mapping and quantification in gene expression analysis of complex microbiomes. The findings provide insights for improving analytical accuracy and advancing functional studies using both metagenomic and metatranscriptomic data.
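As a rough numerical illustration of why rRNA-mapped reads distort TPM comparisons, here is a minimal Python sketch. It is not the authors' pipeline: the function `tpm`, the feature names, and all counts and lengths are hypothetical toy values chosen so that two samples differ only in how many reads hit an rRNA-derived feature.

```python
from typing import Dict, FrozenSet


def tpm(counts: Dict[str, float],
        lengths: Dict[str, float],
        exclude: FrozenSet[str] = frozenset()) -> Dict[str, float]:
    """Compute TPM over all features that are not listed in `exclude`."""
    # Length-normalized read rate for each retained feature.
    rates = {g: counts[g] / lengths[g] for g in counts if g not in exclude}
    total = sum(rates.values())
    if total == 0:
        raise ValueError("no mapped reads left after exclusion")
    # Scale so that the retained features sum to one million.
    return {g: r / total * 1e6 for g, r in rates.items()}


# Hypothetical toy data: identical counts for geneA and geneB in both samples,
# but sample2 carries 100x more reads on an rRNA-derived "coding" feature.
lengths = {"geneA": 1000, "geneB": 2000, "rRNA_like": 1500}
sample1 = {"geneA": 100, "geneB": 200, "rRNA_like": 150}
sample2 = {"geneA": 100, "geneB": 200, "rRNA_like": 15000}

rrna = frozenset({"rRNA_like"})

# Fold change of geneA (sample2 / sample1) with the rRNA feature kept in.
fc_incl = tpm(sample2, lengths)["geneA"] / tpm(sample1, lengths)["geneA"]

# Same fold change after dropping the rRNA feature before normalization.
fc_excl = (tpm(sample2, lengths, rrna)["geneA"]
           / tpm(sample1, lengths, rrna)["geneA"])

print(f"geneA fold change, rRNA included: {fc_incl:.4f}")
print(f"geneA fold change, rRNA excluded: {fc_excl:.4f}")
```

In this toy case geneA has the same read count in both samples, yet keeping the rRNA feature in the denominator makes it look strongly down-regulated, while excluding it gives a fold change of 1. In practice the set of features to exclude would have to come from an rRNA annotation step, which is outside the scope of this sketch.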

## Full-text entities

- **Diseases:** injury to (MESH:D014947)
- **Chemicals:** SRR22506317 (-), carbon (MESH:D002244), nitrogen (MESH:D009584)
- **Species:** Nitrosomonas sp. (species) [taxon 42353], Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12113988/full.md

---
Source: https://tomesphere.com/paper/PMC12113988