# Limitations of sequence dissimilarity as a predictor of prokaryotic lineage

**Authors:** Alvar A. Lavin, Juan Rivas-Santisteban

PMC · DOI: 10.1098/rsob.240302 · Open Biology · 2025-03-19

## TL;DR

This paper questions how reliably sequence dissimilarity indicates prokaryotic lineage. Using simulations of 5S rRNA sequences from two distant lineages, it argues that genes from unrelated lineages can grow more similar over time because the polymorphic space available to a gene is finite.

## Contribution

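The paper simulates, in realistic scenarios, the critical times at which specific 5S rRNA sequences from two distant lineages exhaust their polymorphic space, and uses the result to argue that sequence similarity can mislead lineage inference for prokaryotic genes.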

## Key findings

- Sequence dissimilarity may not reliably indicate lineage divergence due to finite polymorphic space.
- Genes from distant lineages can become similar over time, mimicking phylogenetic features by convergence or by chance without implying true kinship.
- The molecular clock assumption may lead to inaccuracies in prokaryotic lineage inference.
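To make "finite polymorphic space" concrete, the following is a minimal sketch, not taken from the paper, that counts how many distinct sequences exist when only some positions of a gene are free to vary. The function names, the figure of roughly 120 nucleotides for a 5S rRNA, and the example counts of variable sites are illustrative assumptions.

```python
import math


def polymorphic_space_size(variable_sites: int, states_per_site: int = 4) -> int:
    """Count distinct sequences when `variable_sites` positions can each
    hold any of `states_per_site` nucleotides (all other positions fixed)."""
    return states_per_site ** variable_sites


def log10_space_size(variable_sites: int, states_per_site: int = 4) -> float:
    """Base-10 logarithm of the space size, convenient for comparing magnitudes."""
    return variable_sites * math.log10(states_per_site)


if __name__ == "__main__":
    # Hypothetical scenarios: every site free to vary (about 120 nt, an
    # assumed approximate 5S rRNA length) versus progressively stronger
    # functional constraint that leaves fewer variable sites.
    for k in (120, 40, 10):
        print(f"{k:>3} variable sites -> about 10^{log10_space_size(k):.1f} variants")
```

The point of the comparison is only that the space shrinks exponentially as constraint removes variable sites; how many sites are actually free in a given gene is an empirical question this sketch does not answer.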

## Abstract

The molecular clock rests upon the assumption that the observed changes among sequences capture the differentiation of lineages, or kinship, as dissimilarity increases with time. Although it has been questioned over the years, this paradigmatic principle continues to underlie the idea that the polymorphic space of a gene is so vast that it is unattainable in evolutionary time. Thus, the molecular clock has been used to obtain taxonomic annotations, proving to be very effective at delivering testable results. In this article, however, we ask how often this assumption leads to inaccuracies when inferring the lineage of prokaryotic genes. Thus, we open an interesting discussion by simulating, in realistic scenarios, the critical times in which specific 5S rRNA sequences of two distant lineages are exhausting the polymorphic space. We contend that certain genes in one lineage will become increasingly similar to those in another over time, as the space for new variants is finite, mimicking phylogenetic features by convergence or by chance, without implying true kinship.
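The abstract's central claim, that unrelated sequences can drift toward similarity when the space of variants is small, can be illustrated with a toy simulation. This is a hedged sketch under strong assumptions and is not the authors' simulation: substitutions are equally likely among the four nucleotides, only a fixed set of sites may vary, and the two lineages start maximally different at those sites. All names and parameter values (`simulate`, `n_variable`, `steps`, and so on) are hypothetical.

```python
import random

NUCLEOTIDES = "ACGT"


def mutate(seq, variable_sites, rng):
    """Substitute one randomly chosen variable site with a different nucleotide."""
    pos = rng.choice(variable_sites)
    alternatives = [n for n in NUCLEOTIDES if n != seq[pos]]
    seq[pos] = rng.choice(alternatives)


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def simulate(length=120, n_variable=20, steps=2000, seed=0):
    """Evolve two lineages independently and record their distance per step.

    The lineages share every conserved position but begin with a different
    nucleotide at each variable site, so the initial distance is n_variable.
    """
    rng = random.Random(seed)
    variable_sites = rng.sample(range(length), n_variable)
    lineage_a = [rng.choice(NUCLEOTIDES) for _ in range(length)]
    lineage_b = lineage_a[:]
    for pos in variable_sites:
        # Force the starting state to be maximally dissimilar at variable sites.
        lineage_b[pos] = rng.choice([n for n in NUCLEOTIDES if n != lineage_a[pos]])

    distances = []
    for _ in range(steps):
        mutate(lineage_a, variable_sites, rng)
        mutate(lineage_b, variable_sites, rng)
        distances.append(hamming(lineage_a, lineage_b))
    return distances


if __name__ == "__main__":
    d = simulate()
    print("initial distance: 20, minimum observed:", min(d))
    print("mean over last 500 steps:", sum(d[-500:]) / 500)
```

Under these assumptions the expected long-run distance is three quarters of the variable sites, because two independent, uniformly distributed nucleotides differ with probability 3/4. The lineages therefore become more similar than they started purely by chance, and the distance keeps fluctuating around that plateau instead of growing with time.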

## Linked entities

- **Genes:** 5SrRNA (5S ribosomal RNA) [NCBI Gene 857447]

## Full-text entities

- **Chemicals:** proteinogenic amino acids (-), deoxy (MESH:C038782), -nucleotides (MESH:D009711)
- **Species:** Escherichia coli (E. coli, species) [taxon 562], Halorubrum distributum (species) [taxon 29283], Stutzerimonas stutzeri (species) [taxon 316], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11919493/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11919493/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC11919493/full.md

---
Source: https://tomesphere.com/paper/PMC11919493