# Origin and Evolution of Very Large Extracellular Proteins in Fructophilic Lactic Acid Bacteria

**Authors:** Julia E Pedersen, Marina Mota-Merlo, Andrea Garcia-Montaner, Maria Selmer, Siv G E Andersson

PMC · DOI: 10.1093/gbe/evag011 · Genome Biology and Evolution · 2026-01-22

## TL;DR

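This paper predicts the structures of five very large extracellular surface proteins (Giant1-5) in Apilactobacillus kunkeei, a defensive symbiont of the honeybee, and traces their origin and evolution among fructophilic lactic acid bacteria adapted to the carbohydrate-rich niches provided by bees.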

## Contribution

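The study combines structure prediction with phylogenetic analysis to show how the giant extracellular proteins of fructophilic lactic acid bacteria originated and diversified, through sequence exchange between genera and recombination within and across genes.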

## Key findings

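- Structure predictions suggest that the Giant1-4 proteins carry an N-terminal β-solenoid domain similar to the one in serine-rich repeat proteins, where it mediates binding to glycoproteins, polysaccharides, and epithelial cells.
- Phylogenies of the Giant1-3 β-solenoid domains indicate sequence exchange between 2 otherwise distantly related genera of obligate fructophilic lactic acid bacteria; within the A. kunkeei population, the giant1-3 positional homologs diversified mostly through short, intra-genic recombination events.
- Genes for Giant4-5 were identified only in A. kunkeei and 2 closely related species, suggesting a more recent addition to the gene cluster; the giant4-5 genes appear to co-evolve, with predicted recombination events spanning both genes.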

## Abstract

Large surface proteins in bacteria serve important functions in aggregation, biofilm formation, and cell interaction processes. In Apilactobacillus kunkeei, a defensive symbiont of the honeybee Apis mellifera, as much as 6% of the 1.5 Mb genome consists of 5 consecutive genes for extracellular surface proteins of 3,000 to 8,000 amino acids, named Giant1-5. Here, we predict the structures of these proteins and provide a study of their origin and evolution. The structure predictions suggest that the Giant1-4 proteins contain a β-solenoid domain at their N-terminal ends with similarity to the β-solenoid domain in serine-rich repeat proteins, which mediates binding to glycoproteins, polysaccharides, and epithelial cells. Phylogenetic analyses based on the β-solenoid domains of the Giant1-3 proteins indicate sequence exchange between 2 genera of otherwise distantly related obligate fructophilic lactic acid bacteria, while the diversification of the positional homologs of the giant1-3 genes in the A. kunkeei population is mostly due to short, intra-genic recombination events. Genes for the Giant4-5 proteins were only identified in A. kunkeei and 2 closely related bacterial species, suggesting that they were added to the giant gene cluster more recently. The phylogenetic analyses indicate co-evolution of the giant4-5 genes in A. kunkeei, and the near sequence identity of one of the 2 giant4-5 subtypes correlates with predicted recombination events that span across both genes. Our findings provide new insights into the evolution of very large surface proteins in the bacterial ecosystem adapted to the carbohydrate-rich growth niches provided by bees, their food sources, and food products.
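The genome-fraction figure in the abstract can be checked with simple arithmetic. The sketch below is not from the paper: it only uses the numbers quoted above (a 1.5 Mb genome, 6%, 5 genes, 3,000 to 8,000 amino acids) and assumes 3 bp per residue plus one stop codon per gene, ignoring intergenic spacers. All variable and function names are illustrative.

```python
# Rough consistency check of the figures quoted in the abstract.
# Assumptions (not from the paper): 3 bp per residue, one stop codon
# per gene, intergenic regions ignored.

GENOME_SIZE_BP = 1_500_000   # "1.5 Mb genome"
CLUSTER_PERCENT = 6          # "as much as 6%"
N_GENES = 5                  # Giant1-5
MIN_AA, MAX_AA = 3_000, 8_000
BP_PER_CODON = 3


def coding_length_bp(n_residues: int) -> int:
    """Approximate coding length: one codon per residue plus a stop codon."""
    return (n_residues + 1) * BP_PER_CODON


# Size of the region implied by the quoted genome fraction.
cluster_bp = GENOME_SIZE_BP * CLUSTER_PERCENT // 100

# Bounds if all five proteins sat at the small or the large end of the range.
lower_bp = N_GENES * coding_length_bp(MIN_AA)
upper_bp = N_GENES * coding_length_bp(MAX_AA)

# Mean protein length implied by the quoted fraction.
mean_residues = cluster_bp // (N_GENES * BP_PER_CODON)

print(f"Region implied by the quoted fraction: {cluster_bp} bp")
print(f"Range implied by protein lengths: {lower_bp} to {upper_bp} bp")
print(f"Implied mean protein length: about {mean_residues} residues")
```

Under these assumptions, 6% of the genome corresponds to about 90 kb, or a mean of roughly 6,000 residues per protein, which falls inside the stated 3,000 to 8,000 range.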

## Linked entities

- **Species:** Apilactobacillus kunkeei (taxon 148814), Apis mellifera (taxon 7460)

## Full-text entities

- **Genes:** SP1 [NCBI Gene 726286]
- **Chemicals:** oxygen (MESH:D010100), carbohydrate (MESH:D002241), serine (MESH:D012694), acids (MESH:D000143), aspartate (MESH:D001224), pyrimidine (MESH:C030986), alanine (MESH:D000409), polysaccharides (MESH:D011134), asparagine (MESH:D001216), glucose (MESH:D005947), amino acid (MESH:D000596), Agarose (MESH:D012685), oligosaccharides (MESH:D009844), lactic acid (MESH:D019344), purine (MESH:C030985), fructose (MESH:D005632)
- **Species:** Sus scrofa (pig, species) [taxon 9823], Aspergillus sp. 09-01 (species) [taxon 528256], Paenibacillus (genus) [taxon 44249], Drosophila melanogaster (fruit fly, species) [taxon 7227], Lactiplantibacillus plantarum (species) [taxon 1590], Fructobacillus pseudoficulneus (species) [taxon 220714], Fructobacillus fructosus (species) [taxon 1631], Bacteria Latreille et al. 1825 (Bacteria stick insect, genus) [taxon 629395], Melanogaster (genus) [taxon 80614], Fructobacillus (genus) [taxon 559173], Apis mellifera (bee, species) [taxon 7460], Apilactobacillus apisilvae (species) [taxon 2923364], Leuconostoc citreum (species) [taxon 33964], Limosilactobacillus reuteri (species) [taxon 1598]
- **Mutations:** A to C, F549S
- **Cell lines:** H4B5-04J (Homo sapiens, Melanoma, Cancer cell line; CVCL_S856); H1B1-04J (Homo sapiens, Childhood T acute lymphoblastic leukemia, Cancer cell line; CVCL_8279); H4B4-02J (Cricetulus griseus, Chinese hamster, Spontaneously immortalized cell line; CVCL_DD05)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12863079/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12863079/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12863079/full.md

---
Source: https://tomesphere.com/paper/PMC12863079