# wgbstools: a computational suite for DNA methylation sequencing data analysis

**Authors:** Netanel Loyfer, Jonathan Rosenski, Tommy Kaplan

PMC · DOI: 10.26508/lsa.202503514 · Life Science Alliance · 2026-01-29

## TL;DR

wgbstools is a software suite for analyzing DNA methylation sequencing data at fragment-level resolution, offering compact storage and analysis beyond single-CpG sites.

## Contribution

Introduces wgbstools, a computational suite enabling fragment-level methylation analysis with compact storage and privacy-preserving features.

## Key findings

- wgbstools provides a custom epiread file format achieving over 100x compression from BAM files.
- The suite supports fragment-level methylation analysis and visualization across multiple genomic regions and samples.
- The suite includes algorithms for genomic segmentation, biomarker identification, and integration of genetic and epigenetic data.

## Abstract

wgbstools provides compact data formats, fragment-level analysis, and visualization that reveal biological structure in DNA methylation sequencing data beyond single-CpG analysis, while preserving privacy.

Next-generation methylation-aware sequencing of DNA sheds light on the fundamental role of methylation in cellular function in health and disease, increasing the number of covered CpG sites from hundreds of thousands in previous array-based approaches to tens of millions across the whole genome. While array-based approaches are limited to single-CpG resolution, next-generation sequencing allows for a more detailed, single-molecule fragment-level analysis; however, existing tools to fully use this capability are not yet well developed. Here, we present wgbstools, an extensive computational suite tailored for methylation sequencing data. wgbstools allows fast access and ultracompact anonymized representation of high-throughput methylome data, obtained through various library preparation and sequencing methods, with a custom epiread file format achieving a compression factor of over 100x from the input BAM file. In addition, wgbstools contains state-of-the-art algorithms for genomic segmentation, biomarker identification, genetic and epigenetic data integration, and more. wgbstools offers fragment-level analysis and informative visualizations, across multiple genomic regions and samples.
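To illustrate why fragment-level analysis can reveal structure that single-CpG averages miss, here is a minimal Python sketch. It does not use wgbstools or its file format. The record layout (first CpG index, pattern string, fragment count), the `C`/`T`/`.` encoding, the function names, and the thresholds are all assumptions made for this illustration.

```python
from collections import Counter

# Hypothetical encoding (an assumption for this sketch, not the wgbstools format):
# 'C' = methylated CpG, 'T' = unmethylated CpG, '.' = CpG not covered.
# Each record is (index of first CpG covered, pattern, number of identical fragments).


def per_cpg_beta(records):
    """Average methylation per CpG site, ignoring which fragment each call came from."""
    meth = Counter()
    total = Counter()
    for start, pattern, count in records:
        for offset, state in enumerate(pattern):
            if state == ".":
                continue  # uncovered CpG contributes nothing
            site = start + offset
            total[site] += count
            if state == "C":
                meth[site] += count
    return {site: meth[site] / total[site] for site in sorted(total)}


def fragment_classes(records, min_cpgs=3, threshold=0.75):
    """Classify whole fragments as mostly methylated, mostly unmethylated, or mixed."""
    classes = Counter()
    for _, pattern, count in records:
        observed = [s for s in pattern if s != "."]
        if len(observed) < min_cpgs:
            continue  # too few CpGs to classify the fragment
        frac = observed.count("C") / len(observed)
        if frac >= threshold:
            classes["methylated"] += count
        elif frac <= 1 - threshold:
            classes["unmethylated"] += count
        else:
            classes["mixed"] += count
    return dict(classes)


# Two toy samples with identical per-CpG averages but different fragment structure.
mixture = [(100, "CCCC", 5), (100, "TTTT", 5)]  # two distinct molecule populations
noisy = [(100, "CTCT", 5), (100, "TCTC", 5)]    # every molecule partially methylated

if __name__ == "__main__":
    print(per_cpg_beta(mixture))
    print(per_cpg_beta(noisy))
    print(fragment_classes(mixture))
    print(fragment_classes(noisy))
```

In this toy example both samples have the same average methylation at every CpG, so a single-CpG summary cannot tell them apart, whereas the fragment-level classification separates a mixture of fully methylated and fully unmethylated molecules from a population of uniformly mixed molecules.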

## Full-text entities

- **Genes:** MEST (mesoderm specific transcript) [NCBI Gene 4232] {aka PEG1}, LGR5 (leucine rich repeat containing G protein-coupled receptor 5) [NCBI Gene 8549] {aka FEX, GPR49, GPR67, GRP49, HG38}
- **Diseases:** ONT (MESH:C000719218), cancer (MESH:D009369)
- **Chemicals:** T (MESH:D014316), H (MESH:D006859), cytosine (MESH:D003596), SAMtools (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12861688/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12861688/full.md

## References

39 references — full list in the complete paper: https://tomesphere.com/paper/PMC12861688/full.md

---
Source: https://tomesphere.com/paper/PMC12861688