From Bioinformatic Modeling to Clinical Observation: Potential Implications of Ribosomal RNA Folding in Blastocystis sp. Isolates from Symptomatic and Asymptomatic Carriers
Fernando Martínez-Hernández, Arony Martínez, Cecilia Zampedri, Mirza Romero-Valdovinos, Carlos Jiménez-Gutiérrez, Karina Flores-Martínez, Armando Trejo-Chávez, Guiehdani Villalobos, Pablo Maravilla

TL;DR
This study explores how differences in ribosomal RNA structure in Blastocystis isolates from symptomatic and asymptomatic individuals may reflect distinct evolutionary lineages.
Contribution
The study links bioinformatic modeling of rRNA folding with clinical observations, suggesting evolutionary differentiation in Blastocystis subtypes.
Findings
Phylogenetic and haplotype analyses identified genetic subtypes ST1, ST2, and ST3 in Blastocystis isolates.
rRNA secondary structures varied significantly between subtypes, indicating functional relevance.
Interactions between rRNA and ribosomal proteins RPS5 and RPS18 were significant and biologically plausible.
Abstract
Here, 18S-rDNA sequences of Blastocystis sp., previously documented from symptomatic (cases) and asymptomatic (controls) carriers, were analyzed to determine their population structure, predict their secondary structure, and examine their interactions with ribosomal proteins (Bud23, RPS5, and RPS18). Phylogenetic and population differentiation analyses were performed using STRUCTURE software V2.3.4. Moreover, an analysis of the rRNA secondary structure and folding of each sequence was performed, and their probability of interaction with ribosomal proteins was determined. Phylogenetic and haplotype analyses sorted the sequences into genetic subtypes ST1, ST2, and ST3, while the population structure showed each cluster as a differentiated subpopulation, suggesting incipient speciation or cryptic species differentiation. Furthermore, the analysis of the secondary structure of rRNA…
Secretaría de Salud, Mexico, Hospital General “Dr. Manuel Gea Gonzalez”
1. Introduction
Blastocystis sp. represents the most prevalent intestinal microeukaryote encountered in coproparasitological investigations worldwide. The prevalence of this organism exhibits significant geographic variation, with rates of 10–15% in developed nations compared to more than 50% in developing countries [1,2].
The pathogenicity of Blastocystis remains a subject of scientific debate. While numerous studies have established associations between Blastocystis colonization and various gastrointestinal and dermatological manifestations, other investigations suggest that it functions as a non-pathogenic commensal organism within the intestinal microbiota of asymptomatic individuals [3,4]. Furthermore, a remarkable characteristic of Blastocystis sp. is its extensive genetic diversity, with 44 distinct genetic subtypes (STs; ST1, ST2, ST3, … ST44) currently documented, all characterized through the analysis of small subunit ribosomal DNA (18S) sequences [5].
Ribosomes serve as essential sites of protein synthesis across most organisms, functioning as complex particles primarily composed of ribosomal RNA (rRNA) and proteins. The eukaryotic ribosome consists of two distinct subunits (40S and 60S) that undergo an evolutionarily conserved assembly process. This process requires the precise arrangement of rRNA and more than 79 types of ribosomal proteins (RPs), coordinated by over 200 trans-acting biogenesis factors [6,7,8]. Within ribosomes, 18S-rRNA molecules form the structural core of the 40S subunit, folding into characteristic conformations, where small self-complementary regions create double helices and single-stranded hairpin structures. Research has demonstrated that specific regions of 18S-rRNA, through their secondary and tertiary configurations, facilitate the arrangement and assembly of various ribosomal proteins that constitute the ribosomal subunits [9]. Additionally, numerous investigations have utilized the secondary structures of ribosomal genes to validate or enhance phylogenetic reconstructions, highlighting their evolutionary significance [10,11]. Although 18S-rRNA is not commonly associated with pathogenicity, variation in its secondary structure can influence phenotype by tuning protein synthesis. Local shifts in 18S accessibility or conformation may alter pre-rRNA processing, small-subunit assembly, ribosomal protein recruitment, and translational efficiency, reshaping growth and environmental responses and, ultimately, traits linked to pathogenicity. Consistent with this view, temperature-dependent remodeling of the 18S structure and expression has been observed in Plasmodium, supporting a regulatory contribution of the rRNA structure to stage- and environment-specific phenotypes [12]. This mechanistic perspective leverages genomic evidence within an evolutionary framework, enabling phenotype-level explanations without relying on rigid virulence factor paradigms.
RPs are essential components that regulate protein synthesis and facilitate translation processes. However, numerous potential biological functions of these proteins remain unexplored [6]. Among the RPs most extensively investigated for their significant roles in diverse biological processes are Bud23, S5 (RPS5), and S18 (RPS18). It is important to note that Bud23, while not formally an RP, is a highly conserved methyltransferase that facilitates the critical processome-to-pre-40S transition during ribosome biogenesis [7,13].
The eukaryotic ribosomal protein RPS5 is located at the head of the 40S ribosomal subunit (5′ domain) and belongs to a highly conserved ribosomal protein family. Beyond its fundamental role in translation, RPS5 exhibits significant extra-ribosomal functions. Research has demonstrated that RPS5 participates in cellular proliferation and apoptotic pathways [7,14] and contributes to hepatic pathophysiology [15].
Ribosomal protein RPS18 functions as a critical housekeeping component that specifically binds to 18S-rRNA, thereby stabilizing its structure and facilitating the assembly of the 40S subunit in eukaryotic cells. This protein plays an essential role in the binding of fMet-tRNA, which is crucial for translation initiation. Beyond its primary function in protein synthesis, RPS18 exhibits additional roles in various organisms, including modulation of bacterial growth patterns [16,17].
Previous research conducted on Blastocystis stool cultures compared isolates from patients experiencing gastrointestinal disorders (cases) with those from asymptomatic carriers (controls). This investigation revealed that isolates from symptomatic patients demonstrated significantly reduced growth rates and decreased nucleotide variability compared to those from asymptomatic individuals. Despite these physiological differences, Bayesian phylogenetic analysis successfully identified subtypes ST1, ST2, ST3, and ST7 across all isolates, but failed to establish distinct phylogenetic clusters that separated cases from controls [18].
The present study aims to conduct a comprehensive analysis of the population structure in conjunction with phylogenetic network assessment, while also predicting the secondary structure and folding patterns of 18S-rRNA in Blastocystis sp. isolates. Furthermore, we investigate potential interactions between these RNA structures and ribosomal proteins, utilizing previously documented sequence data obtained from both symptomatic and asymptomatic carriers to elucidate possible molecular mechanisms underlying pathogenicity.
2. Materials and Methods
2.1. Sequence Data
For this study, we obtained 18S-rDNA sequence data from 96 Blastocystis sp. isolates (49 from symptomatic patients and 47 from asymptomatic carriers) previously documented by Vargas-Sanchez et al. [18]. Sequences were retrieved from GenBank (https://www.ncbi.nlm.nih.gov/genbank/, accessed on 3 February 2025) using accession numbers KP055659–KP055754, along with corresponding clinical metadata. These sequences map to the 5′ region of the complete 18S-rDNA gene, approximately positions 26 to 340.
2.2. Sequence Alignments
We implemented SSU-ALIGN v0.1.1 [19] in DNA mode with model guidance to align study sequences against a Blastocystis-specific consensus model derived from curated full-length references across all subtypes and sequences from animals not included in the subtype system [20]. Secondary structure support was estimated with RNAalifold v2.4.17 [21] using non-default settings (aln, color, dangles = 2) across five temperatures (37–41 °C), and covariance models were built with ssu-build [19], enforcing an 80% gap threshold (gapthresh 0.8) while disabling relative-entropy filtering (enone). To reduce spurious positions, we compared three masking schemes in ssu-mask [19]—strict (pf 0.95, pt 0.95), medium (pf 0.90, pt 0.92; key “med90”), and relaxed (pf 0.85, pt 0.90)—and retained the medium mask for downstream analyses due to its balance of coverage and structural consistency.
Study sequences were then aligned against this model, producing structurally consistent alignments that account for conservation at the sequence and structural levels, providing the foundation for subsequent analyses.
2.3. Phylogeny and Haplotype Network Analysis
A median-joining network analysis was carried out using NETWORK 4.6 software [22], and haplotype networks were depicted under default settings and assumptions. For all analyses, the following sequences of Blastocystis available in GenBank were used as controls of subtypes: ST1: HQ641595-6; ST2: HQ641602, HQ641654, and JX305874-5; ST3: JX305879, HQ641613, JX305880, JX305883, and HQ641611.
To compare the match between phylogenetic Blastocystis subtypes and the population analysis, Bayesian inference was performed using MrBayes 3.1.2 [23] for 10 million generations, sampling trees every 100 generations. Trees sampled after the chains reached the stationary phase were collected and used to build a consensus tree.
2.4. Population Structure Analysis
STRUCTURE V2.3.4 [24] was used to determine the most probable number of clusters across all Blastocystis samples with each ST. The value of K, representing the theoretical number of independent populations (considering two groups, cases and controls, and three STs), was established using the default settings of the software: correlated allele frequencies and admixture [25].
The algorithm starts with an initial random association of alleles into K clusters and was executed with 10 independent replicates for each value of K from 1 to 10 using a burn-in period of 10,000 and 100,000 repetitions after burn-in. The appropriate number of clusters was determined by calculating the delta K value [25]. A second run was performed using the delta K value assigned to the program (K = 2 or K = 3). These new analyses were conducted with a burn-in period of 20,000 and 100,000 Markov chain Monte Carlo (MCMC) repetitions after burn-in [25].
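The ΔK criterion mentioned above can be illustrated with a minimal sketch. It assumes the commonly used formulation in which ΔK is the absolute second-order change in the mean ln P(D) across replicates, divided by the standard deviation of ln P(D) at that K. The function name `delta_k` and the log-likelihood values are hypothetical and are not taken from the study's runs.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Return {K: delta K} from replicate ln P(D) values per K.

    lnp_by_k maps each tested K to a list of ln P(D) values, one per
    independent replicate. Delta K is undefined for the smallest and
    largest K (no neighbour on one side) and is skipped when the
    replicates at K have zero standard deviation.
    """
    means = {k: mean(v) for k, v in lnp_by_k.items()}
    sds = {k: stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in means or k + 1 not in means or sds[k] == 0:
            continue
        second_order = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_order / sds[k]
    return result

# Hypothetical ln P(D) values (three replicates per K), for illustration only.
example_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-600.0, -601.0, -599.0],
    4: [-590.0, -592.0, -588.0],
}
example_delta_k = delta_k(example_runs)
best_k = max(example_delta_k, key=example_delta_k.get)
```

In this toy input the likelihood gain levels off after K = 3, so ΔK peaks there; with real STRUCTURE output, the replicate ln P(D) values would be read from the run summaries instead.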
2.5. RNA Secondary Structure and Folding
The 18S sequences of Blastocystis sp. from cases and controls (320 bp) were transcribed into 18S-rRNAs. Then, these fragments were inserted into consensus alignments of whole 18S-rRNA of Blastocystis (positions 22 to 352 in the 5′ domain, according to Spahn et al. [26] and Granneman et al. [27]) for STs 1–3. The secondary structures and folding were predicted with the RNAfold web server [28], and the minimum free energy (ΔG) was estimated for stability comparison.
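The two data-handling steps around the folding prediction can be sketched as follows: converting a DNA sequence to its RNA equivalent, and reading base pairs out of a dot-bracket string such as the one a folding tool returns. This is an illustrative sketch only; the helper names `transcribe` and `base_pairs` are hypothetical, the sketch assumes the input is the coding-strand sequence, and it does not compute ΔG or reproduce the RNAfold algorithm.

```python
def transcribe(dna: str) -> str:
    """Convert a coding-strand DNA sequence to RNA by replacing T with U."""
    rna = dna.upper().replace("T", "U")
    unexpected = set(rna) - set("ACGUN")
    if unexpected:
        raise ValueError(f"unexpected symbols: {sorted(unexpected)}")
    return rna

def base_pairs(dot_bracket: str) -> list:
    """Return sorted (i, j) index pairs encoded by a dot-bracket string.

    '(' opens a pair, ')' closes the most recently opened one, and '.'
    marks an unpaired position. Indices are 0-based.
    """
    stack, pairs = [], []
    for i, symbol in enumerate(dot_bracket):
        if symbol == "(":
            stack.append(i)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            pairs.append((stack.pop(), i))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol: {symbol!r}")
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return sorted(pairs)
```

Comparing the pair lists of two predicted structures is one simple way to check whether two isolates share the same helices or differ in a substructure.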
2.6. Prediction of RNA–Protein Interactions
The interaction between 18S-rRNA sequences of Blastocystis from cases and controls and Bud23, RPS5, and RPS18 was analyzed; sequences of these RPs from ST1 and ST4 genomes of Blastocystis available in GenBank (S18: OAO11934 and XP_014528798.1; S5: OAO14796) were downloaded. Then, analysis of RNA–protein interactions was performed online using RPISeq software V1.0 [27], contrasting the converted sequences of each 18S-rRNA from cases and controls with the sequences of Bud23, RPS5, and RPS18 to predict the probability of interaction between them using a Support Vector Machine (SVM) algorithm.
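Sequence-based interaction predictors of this kind work from composition features rather than from solved structures. As a rough illustration of what such a feature vector looks like on the RNA side, the sketch below computes normalized k-mer frequencies. It assumes k = 4 and a simple division by the number of valid windows; both choices, and the function name `kmer_frequencies`, are assumptions made for illustration and should not be read as the RPISeq implementation.

```python
from itertools import product

def kmer_frequencies(rna: str, k: int = 4) -> list:
    """Return normalized k-mer frequencies over the alphabet A, C, G, U.

    The vector has 4**k entries in lexicographic order (AAAA, AAAC, ...).
    Windows containing other symbols (for example N) are ignored.
    """
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    seq = rna.upper()
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]
```

A classifier such as an SVM would then be trained on vectors like this one, concatenated with a comparable encoding of the protein sequence.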
2.7. Statistical Analysis
Student’s t-test was used to compare means for ribosomal proteins RPS5, RPS18, and Bud23, and Levene’s test was used to assess homogeneity of variance. Mean differences and their respective 95% confidence intervals (95% CI) were also analyzed. For the analysis of ST1, ST2, and ST3, proportions were tested using the χ² statistic, and compliance with the expected value assumption (e > 5) was verified. A one-tailed hypothesis was assumed, with a significance level of p ≤ 0.05 and 95% CI. The analysis was performed using SPSS version 31 for iOS (SPSS, Chicago, IL, USA).
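The χ² comparison of subtype proportions, together with the check that every expected count exceeds 5, can be sketched as below. The counts in the example table are hypothetical and do not reproduce the study data; the function name `chi_square` is likewise illustrative, and the p-value lookup is left out because it would need a statistics library.

```python
def chi_square(table):
    """Pearson chi-square statistic for an r x c table of observed counts.

    Returns (statistic, degrees_of_freedom, assumption_ok), where
    assumption_ok is True only if every expected count is greater than 5.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    assumption_ok = all(exp > 5 for row in expected for exp in row)
    return statistic, dof, assumption_ok

# Hypothetical counts: rows are cases and controls, columns are ST1, ST2, ST3.
example_table = [[10, 20, 30], [20, 20, 20]]
example_stat, example_dof, example_ok = chi_square(example_table)
```

When the expected-count check fails for a table, an exact test is the usual fallback rather than the χ² approximation.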
3. Results
3.1. Phylogenies and Population Structure Analysis
Phylogenetic analysis revealed distinct clustering patterns for ST1, ST2, and ST3, with no discernible segregation between symptomatic cases and asymptomatic controls (Figure 1A). Consistent with these findings, STRUCTURE analysis generated a histogram clearly delineating three genetically distinct subpopulations corresponding to each ST (Figure 1C). In contrast, when the analysis was conducted under the assumption of two subpopulations (cases versus controls), no significant clustering pattern emerged (Figure 1B).
The haplotype network analysis corroborated the phylogenetic findings, with sequences clustering into specific groups corresponding to each of the STs (Figure 2). While this analysis similarly failed to differentiate between case and control haplotypes, it revealed identical haplotypes shared between symptomatic and asymptomatic carriers across all subtypes.
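The observation that identical haplotypes occur in both groups can be checked with a simple collapse of aligned sequences into haplotypes. The sketch below assumes the sequences are already aligned and trimmed to the same region, so that identical strings represent the same haplotype; the function name `shared_haplotypes`, the group labels, and the toy sequences are hypothetical.

```python
from collections import defaultdict

def shared_haplotypes(records, groups=("case", "control")):
    """Return the set of haplotypes observed in every listed group.

    records is an iterable of (aligned_sequence, group_label) pairs.
    Sequences are compared as upper-case strings.
    """
    groups_by_haplotype = defaultdict(set)
    for sequence, group in records:
        groups_by_haplotype[sequence.upper()].add(group)
    required = set(groups)
    return {
        haplotype
        for haplotype, seen in groups_by_haplotype.items()
        if required <= seen
    }

# Toy records for illustration only.
example_records = [
    ("ACGT", "case"),
    ("acgt", "control"),
    ("ACGA", "case"),
]
example_shared = shared_haplotypes(example_records)
```

A network program additionally places the collapsed haplotypes in a graph by mutational distance, which this sketch does not attempt.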
3.2. 18S-rRNA Secondary Structure and Conformational Analysis
As illustrated in Figure 3, the secondary structure of 18S-rRNA exhibited similar architectural frameworks across all STs, characterized by a conserved scaffold with subtype-specific variations throughout the molecule. Each subtype displayed a primary helix framework that was consistent between cases and controls, though with distinctive substructural elements and loops, i.e., small double strands or loop projections derived from the main structure. Following the nomenclature established by Van de Peer et al. [29] and Wuyts et al. [30], individual helices and substructural loops were enumerated in a clockwise orientation from the 5′ to the 3′ terminus. This analysis identified 14 distinct substructures in ST1 and ST2, while ST3 exhibited 11 substructures (Figure 3). Notably, ST1 displayed conformational polymorphism in substructures 1 and 13, whereas ST2 and ST3 each presented four conformational variants. Importantly, no case-specific or control-specific substructural elements were identified in any of the subtypes.
3.3. RNA–Protein Interaction Analysis
Based on the interaction probability data between ribosomal proteins (Bud23, RPS5, and RPS18) and 18S-rRNA sequences from Blastocystis isolates (Supplementary Materials, Table S1), we conducted comparative analyses between subtypes and between case and control groups. Table 1 summarizes these comparisons. All three ribosomal proteins demonstrated high interaction probabilities (mean > 0.91) with 18S-rRNA, with Bud23 showing the highest values, followed by RPS18 and RPS5, respectively. The majority of the data points clustered between the lower quartile (Q1) and the median (Supplementary Materials, Figure S1). Notably, when comparing overall case versus control values, statistically significant differences were observed for RPS18 and RPS5 interactions; however, when stratified by subtype, no significant differences were detected between cases and controls within individual STs.
4. Discussion
Blastocystis STs are considered to be ribosomal lineages [20], aligning with criteria established by Jacob et al. [31], which characterize ribosomal lineages as organisms with ≥80% of the 18S-rDNA gene sequenced, but lacking definitive morphological differentiation. These lineages may ultimately be reclassified as discrete species, pending further morphological and molecular characterization. Our phylogenetic analyses and STRUCTURE histogram data provide robust support for the classification of ST1, ST2, and ST3 as distinct ribosomal lineages, offering compelling evidence for their status as cryptic species with genetically differentiated populations despite morphological homogeneity.
In human 18S-rDNA, the copy number of ribosomal genes per cell demonstrates significant interpersonal variation, ranging from 67 to 412 copies (mean = 217) [32]. Comparatively, Blastocystis ST7 has been documented to contain 17 tandem copies [33], highlighting species-specific patterns of ribosomal gene organization.
The eukaryotic ribosomal RNA gene (rDNA) is structurally organized into multiple tandem-repeating units. These units of rDNA undergo coordinated evolution within species, resulting in homogenization of repetitive DNA sequences within a species while maintaining sequence divergence between species or higher taxonomic categories. This molecular evolutionary phenomenon is termed concerted evolution [34,35].
Concerted evolution provides a valuable framework for understanding cryptic speciation phenomena. For example, a phylogenetic investigation incorporating rRNA secondary structure analysis of eleven representative solefish specimens (family Soleidae) demonstrated that six species exhibited minimal variation, suggesting a concerted evolutionary pattern, while other samples suggested non-concerted evolutionary dynamics. Additionally, this analytical approach facilitated taxonomic clarification, resolving the previous misclassification of a genus [34].
In the present study, our analysis of the specific 18S-rRNA secondary structure arrangements for each subtype, coupled with the distinct subpopulation clustering observed in the STRUCTURE analysis, strongly suggests that concerted evolution drives the differentiation of Blastocystis subtypes. This evolutionary mechanism is consistent with the genomic organization of ribosomal genes in most eukaryotic organisms, where rDNA exists in tandem arrays and undergoes concerted evolution [34,36,37].
Our computational predictions of RNA–protein interactions revealed a hierarchical pattern of binding probabilities, with Bud23 demonstrating the highest interaction probability, followed by RPS18 and RPS5. Statistical analysis uncovered significant differences in RPS18 and RPS5 interaction probabilities between symptomatic and asymptomatic carriers in the overall sample. However, these differences were not observed when the interactions were stratified by subtype, suggesting subtype-specific interaction patterns. These findings align with the established functional roles of these ribosomal proteins. Bud23, as a highly conserved methyltransferase, functions during the early processome-to-pre-40S transition stage [7,13], making its interactions relatively resistant to 18S-rRNA sequence variations. Conversely, RPS18 and RPS5 directly bind to 18S-rRNA to stabilize its structure and facilitate 40S subunit assembly [16,17]. Consequently, sequence variations in 18S-rRNA more profoundly affect these later-stage interactions, potentially compromising assembly efficiency.
The bioenergetics of ribosome maturation represents a substantial cellular investment, with eukaryotic cells dedicating a majority of their metabolic energy to producing functional ribosomes [24]. The 5′ terminal region of 18S-rRNA exhibits considerable species-specific and cell type-specific variability [24]; nevertheless, its fundamental role in translation initiation and protein synthesis remains universally critical across eukaryotic lineages [6]. These functional constraints likely influence the evolutionary patterns observed in Blastocystis ribosomal RNA.
Previous research has documented that Blastocystis exhibits variable generation times (GT) in axenic culture, ranging from 8.5–19.4 h depending on strain, with a mean GT of 11.7 h across eight experimentally tested strains [38]. In an investigation conducted by Vargas-Sanchez et al. [18], fecal samples from symptomatic (cases) and asymptomatic (controls) Blastocystis carriers were cultured in two distinct media. This study revealed that isolates from symptomatic patients demonstrated significantly reduced growth rates compared to those from asymptomatic controls, suggesting enhanced protein synthesis efficiency in the control isolates. Furthermore, the controls exhibited greater nucleotide diversity relative to the case group. This observation aligns with established eukaryotic growth patterns, wherein hundreds of copies of transcriptional units are encoded during proliferative phases to accommodate increased ribosomal demand [39]. Consequently, greater sequence variation would be expected in samples displaying normal growth kinetics, such as those from the control group. These findings support the biological plausibility that subtle conformational alterations in ribosome assembly may substantially influence protein synthesis rates.
In our current investigation, the probability of interaction between RPS18 and RPS5 and 18S-rRNA was significantly higher in the control sequences compared to the case sequences. This molecular distinction may elucidate the observed reduction in generation time, manifesting as accelerated growth in control isolates relative to those from symptomatic cases. The enhanced RNA–protein interaction efficiency could potentially facilitate more rapid ribosomal assembly and consequently expedite protein synthesis in asymptomatic carriers.
It is noteworthy that ribosomal proteins may exhibit functions beyond their canonical roles in protein synthesis. For instance, a comprehensive functional characterization of RPS18 from the cattle tick Rhipicephalus microplus demonstrated that recombinant RPS18 significantly inhibited the growth of both Gram-negative and Gram-positive bacteria under in vitro conditions [16]. This finding suggests potential auxiliary functions of ribosomal proteins that may influence pathogen–host interactions in ways not directly related to translation efficiency.
While some studies have identified associations between symptomatic presentation and infection with specific Blastocystis subtypes [40,41], our findings do not support a direct correlation between virulence/pathogenicity and 18S-rRNA sequence composition. Although multiple studies have documented the release of factors that may induce inflammatory processes in the host [42,43,44], these pathogenic mechanisms have not been demonstrated to correlate with 18S-rRNA configuration or sequence variations.
Some limitations of the present study are as follows: First, complete 18S-rDNA sequencing was not performed, potentially underestimating the frequency of rRNA secondary structure folding polymorphisms. Second, pseudogene identification protocols were not implemented, introducing potential bias in the enumeration of substructure loops across STs. Third, there may be a potential sampling bias, because a small number of case and control samples were analyzed from a specific population.
5. Conclusions
Our findings reinforce the hypothesis that ribosomal subtypes ST1, ST2, and ST3 of Blastocystis represent evolutionarily distinct lineages with the potential to be recognized as future species. Furthermore, they underscore the functional relevance of 18S-rRNA sequences from clinical isolates of Blastocystis, suggesting the operation of concerted evolutionary processes in this organism. Finally, the differential interaction probabilities between Blastocystis 18S-rRNA secondary structures and ribosomal proteins RPS5 and RPS18 offer a mechanistic explanation for previously documented differences in generation times between isolates from symptomatic and asymptomatic carriers.
References
- 1. Tan K.S. New insights on classification, identification, and clinical relevance of Blastocystis spp. Clin. Microbiol. Rev. 2008, 21, 639–665. doi:10.1128/CMR.00022-08. PMID: 18854485. PMCID: PMC2570156.
- 2. Kwon J.Y., Choi J.H., Lee H.I., Ju J.W., Lee M.R. Molecular Prevalence of Blastocystis sp. from Patients with Diarrhea in the Republic of Korea. Microorganisms 2024, 12, 523. doi:10.3390/microorganisms12030523. PMID: 38543574. PMCID: PMC10972355.
- 3. Olyaiee A., Sadeghi A., Yadegar A., Mirsamadi E.S., Mirjalali H. Gut Microbiota Shifting in Irritable Bowel Syndrome: The Mysterious Role of Blastocystis sp. Front. Med. 2022, 9, 890127. doi:10.3389/fmed.2022.890127. PMID: 35795640. PMCID: PMC9251125.
- 4. Stensvold C.R., Tan K.S.W., Clark C.G. Blastocystis. Trends Parasitol. 2020, 36, 315–316. doi:10.1016/j.pt.2019.12.008. PMID: 32001134.
- 5. Santin M., Figueiredo A., Molokin A., George N.S., Köster P.C., Dashti A., González-Barrio D., Carmena D., Maloney J.G. Division of Blastocystis ST10 into three new subtypes: ST42–ST44. J. Eukaryot. Microbiol. 2024, 71, e12998. doi:10.1111/jeu.12998. PMID: 37658622.
- 6. Qiu L., Chao W., Zhong S., Ren A.J. Eukaryotic Ribosomal Protein S5 of the 40S Subunit: Structure and Function. Int. J. Mol. Sci. 2023, 24, 3386. doi:10.3390/ijms24043386. PMID: 36834797. PMCID: PMC9958902.
- 7. Black J.J., Johnson A.W. Release of the ribosome biogenesis factor Bud23 from small subunit precursors in yeast. RNA 2022, 28, 371–389. doi:10.1261/rna.079025.121. PMID: 34934010. PMCID: PMC8848936.
- 8. Rodgers M.L., Woodson S.A. A roadmap for rRNA folding and assembly during transcription. Trends Biochem. Sci. 2021, 46, 889–901. doi:10.1016/j.tibs.2021.05.009. PMID: 34176739. PMCID: PMC8526401.
