# AdDeam: a fast and scalable tool for estimating and clustering reference-level damage profiles

**Authors:** Louis Kraft, Thorfinn Sand Korneliussen, Peter Wad Sackett, Gabriel Renaud

PMC · DOI: 10.1093/bioinformatics/btaf407 · Bioinformatics · 2025-07-17

## TL;DR

AdDeam is a new tool that quickly identifies and groups DNA damage patterns in ancient DNA samples, helping researchers detect contamination and authenticate ancient DNA.

## Contribution

AdDeam introduces a fast and scalable method for clustering DNA damage profiles across multiple samples or contigs.

## Key findings

- AdDeam distinguishes samples with different damage levels, for example uracil-DNA glycosylase (UDG)-treated samples and specimens from different time periods.
- The tool can differentiate between contigs containing modern or ancient DNA fragments.
- AdDeam provides a framework for aDNA authentication and large-scale analyses.

## Abstract

DNA damage patterns, such as increased frequencies of C→T and G→A substitutions at fragment ends, are widely used in ancient DNA studies to assess authenticity and detect contamination. In metagenomic studies, fragments can be mapped against multiple references or de novo assembled contigs to identify those likely to be ancient. Generating and comparing damage profiles, however, can be both tedious and time-consuming. Although tools exist for estimating damage in single reference genomes and metagenomic datasets, none efficiently cluster damage patterns.
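
To make the notion of a damage profile concrete, the following is a minimal sketch of how a C→T frequency per position at the 5' end could be tallied. It is not AdDeam's implementation: the function name `ct_profile`, the gap-free `(reference, read)` string pairs, and the toy data are all assumptions made for illustration.

```python
def ct_profile(pairs, n_positions=5):
    """Per-position C->T mismatch frequency at the 5' end.

    pairs: iterable of (reference, read) strings, already aligned and
    gap-free (a hypothetical simplified input, not a real BAM record).
    """
    c_total = [0] * n_positions   # reference C observed at each position
    c_to_t = [0] * n_positions    # of those, how often the read shows T
    for ref, read in pairs:
        for i in range(min(n_positions, len(ref), len(read))):
            if ref[i] == "C":
                c_total[i] += 1
                if read[i] == "T":
                    c_to_t[i] += 1
    # Positions without any reference C get a frequency of 0.0
    return [c_to_t[i] / c_total[i] if c_total[i] else 0.0
            for i in range(n_positions)]


# Toy data: two of four fragments carry a C->T change at the first base
toy_pairs = [
    ("CACGT", "TACGT"),
    ("CCGTA", "TCGTA"),
    ("CGCAT", "CGCAT"),
    ("CATCG", "CATCG"),
]
profile = ct_profile(toy_pairs, n_positions=3)
print(profile)
```

The G→A profile at the 3' end could be tallied the same way by reading positions from the end of each fragment. An elevated frequency at the first positions that decays towards the fragment interior is the pattern usually taken as evidence of ancient origin.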

To address this methodological gap, we developed AdDeam, a tool that combines rapid damage estimation with clustering to streamline analyses and make potential contaminants or outliers easy to identify. AdDeam takes aligned ancient DNA (aDNA) fragments from multiple samples or contigs as input, computes their damage patterns, and clusters them. It outputs a representative damage profile for each cluster, the probability that each sample belongs to each cluster, and a Principal Component Analysis (PCA) of the per-sample damage patterns for fast visualisation. We evaluated AdDeam on both simulated and empirical datasets. AdDeam effectively distinguishes different damage levels, such as those of uracil-DNA glycosylase-treated samples and of specimens from different time periods, and can also distinguish between contigs containing modern or ancient fragments. It thereby provides a clear framework for aDNA authentication and facilitates large-scale analyses.
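
The clustering step described above can be illustrated with a small, self-contained sketch. The abstract does not specify the clustering algorithm, so this example swaps in plain k-means with a softmax-style membership score as a stand-in; the function `cluster_profiles`, the `stiffness` parameter, and the toy profiles are hypothetical and do not reflect AdDeam's actual method or output format. The PCA step is omitted.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_profiles(profiles, centroids, n_iter=10, stiffness=50.0):
    """Toy k-means with soft memberships (a stand-in, not AdDeam's method).

    Returns (centroids, memberships); memberships[i][k] is a normalised
    score in (0, 1) for profile i and cluster k.
    """
    for _ in range(n_iter):
        # Assignment step: each profile goes to its nearest centroid
        groups = [[] for _ in centroids]
        for p in profiles:
            k = min(range(len(centroids)),
                    key=lambda j: sq_dist(p, centroids[j]))
            groups[k].append(p)
        # Update step: centroid = mean profile; keep old one if group is empty
        centroids = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
    memberships = []
    for p in profiles:
        weights = [math.exp(-stiffness * sq_dist(p, c)) for c in centroids]
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return centroids, memberships


# Toy C->T profiles (first three positions): two damaged, two undamaged
profiles = [
    [0.30, 0.15, 0.08],
    [0.28, 0.14, 0.07],
    [0.02, 0.01, 0.01],
    [0.01, 0.01, 0.01],
]
centroids, memberships = cluster_profiles(
    profiles, centroids=[profiles[0], profiles[2]]
)
for row in memberships:
    print([round(m, 3) for m in row])
```

Each final centroid plays the role of a representative damage profile for its cluster, and each membership row sums to one, which mirrors the per-sample cluster probabilities the tool reports. A probabilistic mixture model would be a more principled way to obtain such probabilities.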

AdDeam is publicly available at https://github.com/LouisPwr/AdDeam and can also be installed via Bioconda. It is implemented in Python and C++. All analysis scripts and datasets are available at https://github.com/LouisPwr/AdDeamAnalysis and on Zenodo (DOI: 10.5281/zenodo.15052427).

## Full-text entities

- **Genes:** UNG (uracil DNA glycosylase) [NCBI Gene 7374] {aka DGU, HIGM4, HIGM5, UDG, UNG1, UNG15}
- **Diseases:** dental calculus (MESH:D003728), periodontal disease (MESH:D010510)
- **Chemicals:** thymine (MESH:D013941), uracil (MESH:D014498), cytosines (MESH:D003596)
- **Species:** Fretibacterium fastidiosum (species) [taxon 651822], Treponema denticola (species) [taxon 158], Pan troglodytes (chimpanzee, species) [taxon 9598], Bacteria Latreille et al. 1825 (Bacteria stick insect, genus) [taxon 629395], Alouatta seniculus (howler monkey, species) [taxon 9503], Alouatta (howler monkeys, genus) [taxon 9499], Tannerella forsythia (species) [taxon 28112], Porphyromonas gingivalis (species) [taxon 837], Gorilla (genus) [taxon 9592], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12317744/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12317744/full.md

## References

53 references — full list in the complete paper: https://tomesphere.com/paper/PMC12317744/full.md

---
Source: https://tomesphere.com/paper/PMC12317744