The mysterious orphans of Mycoplasmataceae
Tatiana V. Tatarinova, Inna Lysnyansky, Yuri V. Nikolsky, and Alexander Bolshoy

TL;DR
This study analyzes protein length distributions across prokaryotic genomes, revealing that Mycoplasmataceae have unique ORFan length patterns, challenging typical correlations between homolog presence and protein size.
Contribution
It provides a comprehensive comparison of homologous and orphan protein lengths, highlighting atypical distributions in Mycoplasmataceae genomes and proposing biological explanations.
Findings
HHPs are generally longer than ORFans in most genomes.
In Mycoplasmataceae genomes, ORFans are longer than HHPs.
Atypical ORFan length distributions are prevalent in Mycoplasma and Ureaplasma genomes.
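The comparison behind these findings can be illustrated with a minimal sketch. It assumes that, for a single genome, protein lengths (in amino acids) have already been split into HHPs and ORFans. The function name `compare_lengths`, the median-based "atypical" flag, and the toy numbers are all hypothetical and chosen for illustration; this summary does not state which statistics the authors used.

```python
from statistics import mean, median


def compare_lengths(hhp_lengths, orfan_lengths):
    """Summarize two protein length samples (amino acids) for one genome."""
    summary = {
        "hhp_mean": mean(hhp_lengths),
        "orfan_mean": mean(orfan_lengths),
        "hhp_median": median(hhp_lengths),
        "orfan_median": median(orfan_lengths),
    }
    # Hypothetical criterion (an assumption, not the paper's method):
    # flag the genome as atypical when the ORFan median exceeds the HHP median.
    summary["atypical"] = summary["orfan_median"] > summary["hhp_median"]
    return summary


# Made-up toy lengths, not data from the paper.
typical = compare_lengths([310, 280, 450, 390], [120, 95, 210, 150])
atypical = compare_lengths([310, 280, 450, 390], [520, 610, 480, 700])

print(typical["atypical"], atypical["atypical"])
```

In the first toy genome the HHPs are longer, matching the usual pattern; in the second the ORFans are longer, which is the kind of reversal reported for Mycoplasmataceae. A real analysis would likely add a formal test of the difference between the two distributions.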
Abstract
Background: The length of a protein sequence is largely determined by its function, i.e. each functional group is associated with an optimal size. However, comparative genomics revealed that protein length may be affected by additional factors. In 2002 it was shown that in the bacterium Escherichia coli and the archaeon Archaeoglobus fulgidus, protein sequences with no homologs are, on average, shorter than those with homologs. Most experts now agree that the length distributions are distinctly different between protein sequences with and without homologs in bacterial and archaeal genomes. In this study, we examine this postulate through a comprehensive analysis of all annotated prokaryotic genomes, focusing on certain exceptions. Results: We compared the length distributions of proteins having homologs (HHPs) and proteins having no homologs (orphans, or ORFans) in all currently annotated…
Taxonomy
Topics: Microbial infections and disease research · Genomics and Phylogenetic Studies · Bacteriophages and microbial interactions
