# Sama: a contig assembler with correctness guarantee

**Authors:** Leena Salmela

PMC · DOI: 10.1186/s13015-025-00280-y · Algorithms for Molecular Biology : AMB · 2025-06-03

## TL;DR

This paper introduces Sama, a contig assembler that estimates the probability of misassembly at each position of a de Bruijn graph-based assembly and uses those estimates to produce contigs with a correctness guarantee.

## Contribution

The paper presents a model for estimating the probability of misassembly at each position of a de Bruijn graph-based assembly. Unlike previous work, the model accounts for missing data, and it is applied to produce contigs with a correctness guarantee.

## Key findings

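- Sama produces contigs with a correctness guarantee and a correctness estimate for each position in the contigs.
- When k-mer coverage is high enough, Sama's contigs have contiguity characteristics similar to those of state-of-the-art assemblers based on heuristic correction of the de Bruijn graph.
- The model may have further applications in downstream analysis of contigs or in analyses that work directly on the de Bruijn graph.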

## Abstract

In genome assembly, the task is to reconstruct a genome from sequencing reads. Current practical methods rely on heuristics that are hard to analyse, so such analyses are not readily available.

We present a model for estimating the probability of misassembly at each position of a de Bruijn graph-based assembly. Unlike previous work, our model also takes missing data into account. We apply the model to produce contigs with a correctness guarantee and with correctness estimates for each position in the contigs.
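To make the setting concrete, the following is a minimal sketch of a node-centric de Bruijn graph built from reads, together with the per-k-mer counts that a coverage-aware analysis would start from. It is not Sama's implementation and does not reproduce the paper's probability model; the function names `count_kmers`, `build_de_bruijn`, and `branching_nodes` are hypothetical, and reverse complements are ignored for simplicity.

```python
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Count every k-mer occurring in the reads (forward strand only)."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_de_bruijn(kmer_counts):
    """Map each k-mer to the observed k-mers that overlap it by k-1 characters."""
    by_prefix = defaultdict(list)
    for kmer in kmer_counts:
        by_prefix[kmer[:-1]].append(kmer)
    # A successor of `kmer` is any observed k-mer whose prefix equals kmer's suffix.
    return {kmer: sorted(by_prefix[kmer[1:]]) for kmer in kmer_counts}


def branching_nodes(graph):
    """Return k-mers with more than one successor, where a path choice is needed."""
    return sorted(kmer for kmer, succ in graph.items() if len(succ) > 1)


if __name__ == "__main__":
    counts = count_kmers(["ACGTT", "ACGTA"], k=3)
    graph = build_de_bruijn(counts)
    print(dict(counts))
    print(graph)
    print(branching_nodes(graph))
```

Branching nodes such as the ones reported here are the places where extending a contig risks a misassembly, and a k-mer that is absent from the reads (missing data) silently removes a node and can hide such a branch.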

Our experiments show that when k-mer coverage is high enough, our method produces contigs with contiguity characteristics similar to those of state-of-the-art assemblers that rely on heuristic correction of the de Bruijn graph. Our model may have further applications in downstream analysis of contigs or in any analysis working directly on the de Bruijn graph.
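This summary does not say which contiguity metrics the experiments report. As an illustration only, the sketch below computes N50, a widely used contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The function name `n50` and the example lengths are assumptions, not values from the paper.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths, or 0 for an empty list."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


if __name__ == "__main__":
    # Made-up contig lengths, for illustration only.
    print(n50([80, 70, 50, 40, 30, 20]))
```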

## Full-text entities

- **Species:** Escherichia coli (E. coli, species) [taxon 562], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12135590/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12135590/full.md

## References

1 reference, listed in the complete paper: https://tomesphere.com/paper/PMC12135590/full.md

---
Source: https://tomesphere.com/paper/PMC12135590