# Machine learning models can identify individuals based on a resident oral bacteriophage family

**Authors:** Gita Mahmoudabadi, Kelsey Homyk, Adam B. Catching, Ana Mahmoudabadi, Helen Bermudez Foley, Arbel D. Tadmor, Rob Phillips

PMC · DOI: 10.3389/frmbi.2024.1408203 · Frontiers in Microbiomes · 2024-09-03

## TL;DR

Targeted sequencing of a small region of an oral phage terminase family yields variant abundance profiles distinctive enough to identify individuals, much like a fingerprint.

## Contribution

The study introduces the 'phageprint', defined as the relative abundance profile of the variants within a phage terminase family, and shows that it can serve as an individual-specific barcode.

## Key findings

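- Each individual harbors one or two dominant phage variants that coexist with numerous low-abundance variants.
- Phageprints tracked across ten individuals were generally stable over a month, with instances of concordant temporal fluctuations in variants shared between partners.
- Machine learning models distinguished individuals with high precision and recall, even after the most abundant variants were removed and the phageprints were downsampled to 2% of the remaining variants.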
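
The downsampling described in the last finding can be illustrated with a minimal sketch. This summary does not say how the authors selected variants, so the code assumes uniform random selection of variants. The function name `downsample_phageprint`, its parameters, and the default of two dropped variants are hypothetical choices for illustration, not the paper's method.

```python
import random


def downsample_phageprint(counts, n_drop_top=2, keep_fraction=0.02, seed=0):
    """Drop the most abundant variants, keep a random fraction of the rest.

    counts: dict mapping a variant id to its read count.
    Returns relative abundances renormalized over the kept variants.
    Assumption: variants are chosen uniformly at random (not from the paper).
    """
    # Rank variants from most to least abundant; ties are broken by id
    # so the ranking is deterministic.
    ranked = sorted(counts, key=lambda v: (-counts[v], v))
    remaining = ranked[n_drop_top:]
    if not remaining:
        return {}

    # Keep at least one variant so the result is never empty here.
    n_keep = max(1, round(keep_fraction * len(remaining)))
    rng = random.Random(seed)
    kept = rng.sample(remaining, n_keep)

    total = sum(counts[v] for v in kept)
    if total == 0:
        return {v: 0.0 for v in kept}
    return {v: counts[v] / total for v in kept}
```

With 100 low-abundance variants and two dominant ones, the defaults would leave two variants, neither of them dominant, whose relative abundances sum to one.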

## Abstract

Metagenomic studies have revolutionized the study of novel phages. However, these studies trade depth of coverage for breadth. We show that the targeted sequencing of a small region of a phage terminase family can provide sufficient sequence diversity to serve as an individual-specific barcode or a “phageprint”, defined as the relative abundance profile of the variants within a terminase family. By collecting ~700 oral samples from ~100 individuals living on multiple continents, we found a consistent trend wherein each individual harbors one or two dominant variants that coexist with numerous low-abundance variants. By tracking phageprints over the span of a month across ten individuals, we observed that phageprints were generally stable, and found instances of concordant temporal fluctuations of variants shared between partners. To quantify these patterns further, we built machine learning models that, with high precision and recall, distinguished individuals even when we eliminated the most abundant variants and further downsampled phageprints to 2% of the remaining variants. Except between partners, phageprints are dissimilar between individuals, and neither country-of-residence, genetics, diet nor cohabitation seem to play a role in the relatedness of phageprints across individuals. By sampling from six different oral sites, we were able to study the impact of millimeters to a few centimeters of separation on an individual’s phageprint and found that such limited spatial separation results in site-specific phageprints.
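
As an illustration of the definition in the abstract, a phageprint can be computed from per-variant read counts as a relative abundance profile. The sketch below also compares two profiles with Bray-Curtis dissimilarity. That metric is an assumption chosen for illustration, since this summary does not state which measure the authors used, and the function names and toy counts are likewise hypothetical.

```python
def phageprint(counts):
    """Convert per-variant read counts into a relative abundance profile."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {variant: count / total for variant, count in counts.items()}


def bray_curtis(p, q):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Returns 0.0 for identical profiles and 1.0 for profiles that share
    no variants. Assumption: this metric is illustrative, not the paper's.
    """
    variants = set(p) | set(q)
    numerator = sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in variants)
    denominator = sum(p.get(v, 0.0) + q.get(v, 0.0) for v in variants)
    return numerator / denominator if denominator else 0.0


# Toy counts (made up): each person has one dominant variant.
person_a = phageprint({"var1": 900, "var2": 60, "var3": 40})
person_b = phageprint({"var4": 850, "var2": 100, "var5": 50})
distance = bray_curtis(person_a, person_b)
```

A value of `distance` close to 1 would correspond to the largely non-overlapping profiles the abstract reports between unrelated individuals.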

## Full-text entities

- **Species:** Bacteriophage sp. (species) [taxon 38018]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12993541/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12993541/full.md

## References

63 references — full list in the complete paper: https://tomesphere.com/paper/PMC12993541/full.md

---
Source: https://tomesphere.com/paper/PMC12993541