# Pipeline Olympics: continuable benchmarking of computational workflows for DNA methylation sequencing data against an experimental gold standard

**Authors:** Yu-Yu Lin, Kersten Breuer, Dieter Weichenhan, Pascal Lafrenz, Antonella Sarnataro, Agata Wilk, Maryna Chepeleva, Oliver Mücke, Maximilian Schönung, Franziska Petermann, Philip Reiner Kensche, Lena Weiser, Frank Thommen, Gideon Giacomelli, Karl Nordstroem, Edahi Gonzalez-Avalos, Angelika Merkel, Helene Kretzmer, Jonas Fischer, Stephen Krämer, Murat Iskar, Stephan Wolf, Ivo Buchhalter, Manel Esteller, Christian Lawerenz, Sven Twardziok, Marc Zapatka, Volker Hovestadt, Matthias Schlesner, Marcel H Schulz, Steve Hoffmann, Clarissa Gerhauser, Jörn Walter, Mark Hartmann, Daniel B Lipka, Yassen Assenov, Christoph Bock, Christoph Plass, Reka Toth, Pavlo Lutsik

PMC · DOI: 10.1093/nar/gkaf970 · Nucleic Acids Research · 2025-10-21

## TL;DR

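This study benchmarks complete computational workflows for processing DNA methylation sequencing data against an experimental gold standard, using a dedicated dataset generated with five whole-genome profiling protocols, and provides an interactive platform so the benchmark can be extended to future software.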

## Contribution

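The study introduces a benchmarking framework that evaluates workflows against accurate locus-specific measurements from the authors' previous benchmark of targeted DNA methylation assays, together with an interactive workflow execution and data presentation platform.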

## Key findings

- Some workflows consistently showed superior performance across multiple metrics when assessed against the experimental gold standard.
- The interactive platform adapts to user-defined criteria and can be readily expanded to future software.
- The performance data revealed major trends in workflow development.

## Abstract

DNA methylation is a widely studied epigenetic mark and a powerful biomarker of cell type, age, environmental exposures, and disease. Whole-genome sequencing following selective conversion of unmethylated cytosines into thymines via bisulfite treatment or enzymatic methods remains the reference method for DNA methylation profiling genome-wide. While numerous software tools facilitate processing of DNA methylation sequencing reads, a comprehensive benchmarking study has been lacking. In this study, we systematically compared complete computational workflows for processing DNA methylation sequencing data using a dedicated benchmarking dataset generated with five whole-genome profiling protocols. As an evaluation reference, we employed accurate locus-specific measurements from our previous benchmark of targeted DNA methylation assays. Based on this experimental gold-standard assessment and multiple performance metrics, we identified workflows that consistently demonstrated superior performance and revealed major workflow development trends. To ensure the long-term utility of our benchmark, we implemented an interactive workflow execution and data presentation platform, adaptable to user-defined criteria and readily expandable to future software.
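To make the idea of a gold-standard assessment concrete, the following is a minimal sketch of how per-locus methylation calls from a workflow could be scored against locus-specific reference measurements. It is not the authors' implementation: the metric choices (mean absolute deviation and Pearson correlation), the function names, the locus identifiers, and all numbers are assumptions made up for illustration, and the paper's actual metrics are described in its full text.

```python
from math import sqrt
from statistics import mean


def shared_loci(calls, gold):
    """Return the loci present in both the workflow calls and the reference."""
    loci = sorted(set(calls) & set(gold))
    if not loci:
        raise ValueError("no loci shared between calls and gold standard")
    return loci


def mean_absolute_deviation(calls, gold):
    """Average absolute difference in methylation level over shared loci."""
    return mean(abs(calls[l] - gold[l]) for l in shared_loci(calls, gold))


def pearson(calls, gold):
    """Pearson correlation of methylation levels over shared loci."""
    loci = shared_loci(calls, gold)
    x = [calls[l] for l in loci]
    y = [gold[l] for l in loci]
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data: methylation levels in [0, 1] keyed by locus name.
gold = {"locus_1": 0.10, "locus_2": 0.50, "locus_3": 0.90}
workflows = {
    "workflow_a": {"locus_1": 0.12, "locus_2": 0.48, "locus_3": 0.88, "locus_4": 0.30},
    "workflow_b": {"locus_1": 0.30, "locus_2": 0.50, "locus_3": 0.70},
}

# Score every workflow, then rank by deviation (lower is better).
scores = {
    name: {
        "mad": mean_absolute_deviation(calls, gold),
        "pearson": pearson(calls, gold),
    }
    for name, calls in workflows.items()
}
ranking = sorted(scores, key=lambda name: scores[name]["mad"])

for name in ranking:
    print(f"{name}: MAD={scores[name]['mad']:.3f}, r={scores[name]['pearson']:.3f}")
```

In this toy data, `workflow_b` correlates perfectly with the reference yet deviates more in absolute terms, which is one reason a benchmark might report several metrics rather than a single score.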

## Full-text entities

- **Diseases:** colon tumor (MESH:D003110), DM (MESH:D009223), Cancer (MESH:D009369), CLL (MESH:D015451), colon cancer (MESH:D015179)
- **Chemicals:** FAME (MESH:C508762), cytosine (MESH:D003596), thymines (MESH:D013941), dA (MESH:C025953), Bisulfite (MESH:C042345), EM (MESH:D004961), Biscuit (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12539629/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12539629/full.md

## References

78 references — full list in the complete paper: https://tomesphere.com/paper/PMC12539629/full.md

---
Source: https://tomesphere.com/paper/PMC12539629