# MutSeqR: an open source R package for standardized analysis of error-corrected next-generation sequencing data in genetic toxicology

**Authors:** Annette E Dodge, Andrew Williams, Danielle P M LeBlanc, David M Schuster, Elena Esina, Charles C Valentine, Jesse J Salk, Alex Y Maslov, Chris Bradley, Carole L Yauk, Francesco Marchetti, Matthew J Meier

PMC · DOI: 10.1093/bioadv/vbaf265 · 2025-10-23

## TL;DR

MutSeqR is an open-source R package that standardizes the analysis of error-corrected sequencing data for genetic toxicology studies.

## Contribution

MutSeqR introduces a standardized, reproducible workflow for analyzing error-corrected sequencing (ECS) data in mutagenicity testing.

## Key findings

- MutSeqR enables comparative mutation frequency analysis and dose–response assessment in genetic toxicology.
- The package supports reproducible analyses across ECS platforms, demonstrated on published Duplex Sequencing and SMM-seq datasets from mice treated with benzo[a]pyrene or benzo[b]fluoranthene.
- Sequencing data and variant call files are publicly available for validation and further research.

## Abstract

Error-corrected next-generation sequencing (ECS) methods are increasingly used to assess mutagenicity and other genetic toxicology endpoints. However, the lack of open, standardized bioinformatic workflows and tools hinders the reproducibility, comparability, and consistent interpretation of ECS data in genetic toxicity assessment.

We present MutSeqR, an open source R package to analyse ECS mutation data for genetic toxicology studies. MutSeqR offers practical variant filtering, comparative analysis of mutation frequency between experimental conditions, dose–response assessment via benchmark dose calculations, mutation spectrum analysis, and clonality analyses. We demonstrate MutSeqR’s application using published datasets on mice treated with benzo[a]pyrene or benzo[b]fluoranthene, analysed using Duplex Sequencing and SMM-seq, respectively. MutSeqR’s flexible functions enable reproducible analyses across ECS platforms, facilitating research and regulatory applications in mutagenicity testing.
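To make the mutation frequency comparison concrete, the sketch below computes a per-sample mutation frequency as the number of mutations divided by the number of bases sequenced, then compares group means between a control and a treated dose group. It is written in Python for illustration only and is not MutSeqR's API. The record layout, the function names (`mutation_frequency`, `group_mean_frequency`), and all numeric values are hypothetical and are not taken from the paper's datasets.

```python
from statistics import mean

# Made-up illustrative values; NOT data from the paper.
# "mutations" = mutations observed in the sample,
# "bases"     = total error-corrected bases sequenced for that sample.
samples = [
    {"sample": "s1", "dose": 0,  "mutations": 12, "bases": 1_000_000_000},
    {"sample": "s2", "dose": 0,  "mutations": 8,  "bases": 800_000_000},
    {"sample": "s3", "dose": 25, "mutations": 60, "bases": 1_000_000_000},
    {"sample": "s4", "dose": 25, "mutations": 45, "bases": 900_000_000},
]


def mutation_frequency(mutations, bases):
    """Return mutations per base sequenced for one sample."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    return mutations / bases


def group_mean_frequency(records):
    """Average the per-sample mutation frequencies within each dose group."""
    by_dose = {}
    for rec in records:
        mf = mutation_frequency(rec["mutations"], rec["bases"])
        by_dose.setdefault(rec["dose"], []).append(mf)
    return {dose: mean(values) for dose, values in by_dose.items()}


group_mf = group_mean_frequency(samples)

# Fold change of the treated group (dose 25) relative to control (dose 0).
fold_change = group_mf[25] / group_mf[0]
```

A real analysis would fit a statistical model to the counts rather than compare raw group means, and would decide how to count clonally expanded variants before computing frequencies. The sketch only shows the quantity being compared.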

MutSeqR is freely available under an open source license at https://github.com/EHSRB-BSRSE-Bioinformatics/MutSeqR. Implemented in R (version 3.4.0 or greater), it supports all major operating systems. Sequencing data for Project 1 has been deposited in the Sequence Read Archive under accession number PRJNA803048. Variant call files for Project 2 are available on Mendeley Data (doi: 10.17632/65dnysxym8.1).

## Linked entities

- **Chemicals:** benzo[a]pyrene (PubChem CID 2336), benzo[b]fluoranthene (PubChem CID 9153)
- **Species:** Mus musculus (taxon 10090)

## Full-text entities

- **Diseases:** genetic toxicity (MESH:D030342)
- **Chemicals:** benzo[a]pyrene (MESH:D001564), benzo[b]fluoranthene (MESH:C006703)
- **Species:** Mus musculus (house mouse; taxon 10090)

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12645840/full.md

---
Source: https://tomesphere.com/paper/PMC12645840