# krepp: a k-mer-based maximum pseudo-likelihood method for estimating read distances and genome-wide phylogenetic placement

**Authors:** Ali Osman Berk Şapcı, Siavash Mirarab

PMC · DOI: 10.1186/s13059-026-03999-y · 2026-02-21

## TL;DR

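krepp is an alignment-free, k-mer-based method that places sequencing reads from anywhere in the genome on ultra-large reference phylogenies (e.g., 123,853 leaves), improving the comparison and characterization of metagenomic samples.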

## Contribution

krepp is a scalable, alignment-free, k-mer-based method for genome-wide phylogenetic placement. It extends placement at scale beyond marker genes, which cover only a small fraction of the genome.

## Key findings

- krepp computes accurate read distances that approximate those obtained from alignments.
- krepp enables phylogenetic placement at scale on ultra-large reference trees.
- The resulting phylogenetic identifications improve the comparison and characterization of metagenomic samples.

## Abstract

Comparing each sequencing read in a sample to a reference database is a fundamental step in wide-ranging applications. Results of these comparisons can enable phylogenetic characterization. However, phylogenetic placement is currently only possible at scale for marker genes, a small fraction of the genome. We introduce krepp, an alignment-free k-mer-based method that enables placing reads from anywhere on the genome on an ultra-large reference phylogeny (e.g., 123,853 leaves). We show that krepp is scalable and computes accurate distances that approximate those using alignments, leading to accurate placements. These precise phylogenetic identifications improve our ability to compare and characterize metagenomic samples.

The online version contains supplementary material available at 10.1186/s13059-026-03999-y.
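
To make the idea of an alignment-free, k-mer-based read distance more concrete, here is a minimal Python sketch. It is not krepp's implementation or its pseudo-likelihood estimator, which are described in the full paper. It assumes a toy model in which each base of a matched k-mer mismatches independently with probability d, so the maximum-likelihood estimate of d is the fraction of mismatched positions. The function names (`kmers`, `hamming`, `estimate_distance`), the parameters `k` and `max_hd`, and the brute-force scan over reference k-mers are all illustrative assumptions.

```python
def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def estimate_distance(read, reference, k=8, max_hd=2):
    """Toy read-to-reference distance (hypothetical, not krepp's estimator).

    Each read k-mer is paired with its closest reference k-mer. Pairs with
    more than max_hd mismatches are treated as unmatched and ignored.
    Under an independent per-base mismatch model, the maximum-likelihood
    distance is mismatched positions / compared positions.
    Returns None when no read k-mer matches.
    """
    ref_kmers = set(kmers(reference, k))
    mismatches = 0
    matched = 0
    for q in kmers(read, k):
        # Brute-force nearest k-mer; a practical tool would use an index.
        best = min((hamming(q, r) for r in ref_kmers), default=None)
        if best is not None and best <= max_hd:
            mismatches += best
            matched += 1
    if matched == 0:
        return None
    return mismatches / (matched * k)


if __name__ == "__main__":
    reference = "ACGTTGCAAGGCTTAACCGGTATCGA"
    read = reference[5:20]  # an error-free read drawn from the reference
    print(estimate_distance(read, reference))
```

Ignoring k-mers beyond `max_hd` biases this toy estimate downward for divergent reads. Handling that bias is one of the things a proper likelihood-based treatment has to address, which is why this sketch should be read only as an intuition aid.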

## Full-text entities

- **Diseases:** CAMI-II (MESH:C537730), HD (MESH:C535290), LSH (MESH:D004828)
- **Chemicals:** saline (MESH:D012965), EMPO 3 (-)
- **Species:** Escherichia coli (E. coli, species) [taxon 562], Homo sapiens (human, species) [taxon 9606], Mycobacterium (genus) [taxon 1763]

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13032499/full.md

---
Source: https://tomesphere.com/paper/PMC13032499