# Prediction of genetic relatedness of Escherichia coli using neighbor typing: a tool for rapid outbreak detection

**Authors:** Amanda C. Carroll, Leanne Mortimer, Hiren Ghosh, Sandra Reuter, Hajo Grundmann, Karel Brinda, William P. Hanage, Angel Li, Aimee Paterson, Andrew Purssell, Ashley M. Rooney, Noelle R. Yee, Bryan Coburn, Shola Able-Thomas, Martin Antonio, Allison McGeer, Derek R. MacFadden

PMC · DOI: 10.1128/aac.01071-25 · Antimicrobial Agents and Chemotherapy · 2026-01-26

## TL;DR

This study shows that a rapid neighbor typing method paired with long-read sequencing can accurately predict the genetic relatedness of E. coli samples and isolates, and could potentially serve as a rapid outbreak surveillance tool.

## Contribution

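The study evaluates genomic neighbor typing of long-read sequence data as a rapid way to estimate genetic relatedness in E. coli, benchmarking it against pairwise distances from short-read draft whole genomes and against reference trees built with mash distances and maximum-likelihood (ML) methods.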

## Key findings

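- Across all three data sets, genetic distances predicted by neighbor typing correlated strongly with the reference methods (Spearman’s rho = 0.75–0.95, P < 0.001).
- A lineage score-informed approach improved the correlations (Spearman’s rho = 0.93–0.94, P < 0.001).
- Predicted genetic trees and clusters were comparable to those generated with mashtree or an ML method: cluster comparability of 85.8–99.5%, Baker’s gamma index (BGI) of 0.8–0.95, and generalized Robinson-Foulds (GRF) values of 0.34–0.8.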
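
To make the idea of tree topology similarity concrete, here is a minimal sketch of the classic Robinson-Foulds distance, which counts the non-trivial leaf splits found in one tree but not the other. This is a simpler relative of the generalized Robinson-Foulds metric the paper reports, not the paper's implementation. The nested-tuple tree encoding, the function names, and the five-leaf example trees are all assumptions made for illustration.

```python
def leaf_sets(tree, out):
    """Return the leaves under `tree`; append each internal node's leaf set to `out`."""
    if isinstance(tree, str):  # a leaf is a plain string label
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.append(leaves)
    return leaves


def splits(tree):
    """Collect the non-trivial splits (bipartitions of the leaves) induced by a tree."""
    clades = []
    all_leaves = leaf_sets(tree, clades)
    anchor = min(all_leaves)  # fixed leaf used to pick one canonical side per split
    result = set()
    for clade in clades:
        # Always store the side that does NOT contain the anchor leaf.
        side = all_leaves - clade if anchor in clade else clade
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            result.add(side)
    return result


def rf_distance(t1, t2):
    """Classic Robinson-Foulds distance; both trees must share the same leaf labels."""
    return len(splits(t1) ^ splits(t2))


def normalized_rf(t1, t2):
    """RF distance divided by the total number of non-trivial splits in both trees."""
    s1, s2 = splits(t1), splits(t2)
    total = len(s1) + len(s2)
    return len(s1 ^ s2) / total if total else 0.0


# Hypothetical trees over five isolates, differing in where B and C attach.
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "C"), "B"), ("D", "E"))

print(rf_distance(tree_a, tree_b))    # number of splits not shared by both trees
print(normalized_rf(tree_a, tree_b))  # 0 = identical topology, 1 = no shared splits
```

The sketch treats trees as unrooted by comparing splits rather than rooted clades, so two trees that differ only in root placement score as identical.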

## Abstract

Identifying the genetic relatedness of resistant bacterial pathogens in healthcare settings can help identify undetected transmission events and outbreaks. However, current methods are time- and resource-intensive. We evaluated a rapid neighbor typing method paired with long-read sequencing for assessment of genetic relatedness. Utilizing a data set of primary clinical samples and published isolate data from two outbreaks of Escherichia coli, we applied genomic neighbor typing of long-read sequence data to rapidly estimate genetic relatedness. We assessed the correlation between neighbor typing predicted genetic distance and pairwise genetic distance from short-read draft whole genomes for all sample pairs. Predicted genetic trees using neighbor typing were compared to reference genetic trees generated using mash distances and maximum-likelihood (ML) methods to assess the extent of agreement, along with metrics of cluster similarity (cluster comparability and Baker’s gamma index [BGI]) and tree topology similarity (generalized Robinson-Foulds [GRF] metric). For all three data sets, we found strong correlations between the reference methods and predicted genetic distances (Spearman’s rho = 0.75–0.95, P < 0.001), which improved when using a lineage score-informed approach (Spearman’s rho = 0.93–0.94, P < 0.001). Predicted genetic trees and clusters from neighbor typing were comparable to those generated using either mashtree or an ML method, with a range of cluster comparability of 85.8–99.5%, BGIs of 0.8–0.95, and GRF values of 0.34–0.8. Pairing the neighbor typing method with long-read sequencing can enable accurate predictions of the relatedness of E. coli samples and isolates, and could potentially be used as a rapid outbreak surveillance tool.
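
The correlation step described in the abstract, comparing predicted and reference genetic distances over all sample pairs, can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: the sample names, the distance values, and the helper names (`average_ranks`, `pairwise_values`, `spearman_rho`) are hypothetical. Spearman's rho is computed as the Pearson correlation of the ranks, with tied values given their average rank.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def pairwise_values(dist, samples):
    """Flatten a symmetric distance lookup into one value per unordered sample pair."""
    return [dist[frozenset(pair)] for pair in combinations(samples, 2)]


# Hypothetical samples and distances, for illustration only.
samples = ["A", "B", "C", "D"]
pairs = list(combinations(samples, 2))
reference = {frozenset(p): d for p, d in zip(pairs, [0.001, 0.020, 0.030, 0.021, 0.029, 0.015])}
predicted = {frozenset(p): d for p, d in zip(pairs, [0.002, 0.018, 0.035, 0.028, 0.027, 0.010])}

rho = spearman_rho(pairwise_values(predicted, samples), pairwise_values(reference, samples))
print(f"Spearman's rho over {len(pairs)} sample pairs: {rho:.3f}")
```

Because the statistic depends only on rank order, it rewards a predictor that orders sample pairs from most to least related correctly, even when the predicted distances are on a different scale from the reference distances.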

## Linked entities

- **Species:** Escherichia coli (taxon 562)

## Full-text entities

- **Species:** Escherichia coli (E. coli, species) [taxon 562]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12959140/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12959140/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/PMC12959140/full.md

---
Source: https://tomesphere.com/paper/PMC12959140