Whole-Genome Phylodynamic Analysis of Respiratory Syncytial Virus—Maryland, USA, 2018–2024
Ting-Xuan Zhuang, Amary Fall, Julie M. Norton, Omar Abdullah, Andrew Pekosz, Eili Klein, Heba H. Mostafa

TL;DR
This study tracks the genetic changes in RSV in Maryland from 2018 to 2024, showing how the virus evolved before and during the pandemic.
Contribution
The study provides a detailed phylodynamic analysis of RSV-A and RSV-B in Maryland, highlighting clade replacements and viral adaptation during the pandemic.
Findings
RSV-A dominated most seasons, while RSV-B had surges in 2018 and 2023.
RSV-A and RSV-B showed distinct evolutionary rates and genetic variability in the G glycoprotein.
Pandemic-related disruptions influenced the effective population size of RSV.
Abstract
Respiratory syncytial virus (RSV) is a leading cause of respiratory infections in infants and older adults, with epidemiological patterns shaped by viral evolution and diversity. To investigate the molecular epidemiology of RSV before and after the COVID-19 pandemic, we conducted genomic surveillance and phylodynamic analyses of RSV-A and RSV-B circulating in Maryland from 2018 to 2024. Whole-genome sequencing of RSV-positive samples (n = 451) was performed, and genomes were analyzed with phylogenetic and Bayesian methods to estimate evolutionary rates, population dynamics, selection pressures, and genetic diversity. RSV-A predominated in most seasons, while RSV-B showed episodic surges in 2018 and 2023. All RSV-A genomes belonged to the ON1 genotype, and RSV-B belonged to BA9, with sequential clade dominances including A.D.1, A.D.5.2, A.D.1.6, and B.D.E.1 across different epidemic…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5- —Johns Hopkins Center of Excellence in Influenza Research and Response
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRespiratory viral infections research · Genomics and Rare Diseases · Virus-based gene therapy research
1. Introduction
Respiratory syncytial virus (RSV) is a major cause of acute respiratory tract infections globally, leading to significant morbidity and mortality among infants, immunocompromised individuals, and older adults [1]. As an RNA virus characterized by rapid evolutionary dynamics and high genetic variability, RSV is classified into two distinct antigenic groups, RSV-A and RSV-B, each with multiple circulating genotypes [2]. RSV genotypes, historically defined by G gene variation (14 for RSV-A and 37 for RSV-B), were later reshaped by the emergence of the BA9 (RSV-B) and ON1 (RSV-A) genotypes, which rapidly displaced earlier genotypes and became globally dominated [3]. More recently RSV was classified into 3 RSV-A and 7 RSV-B genotypes, with whole-genome analyses resolving 24 RSV-A and 16 RSV-B clades [4,5].
Mutations drive the molecular evolution of RNA viruses, influencing their genetic variability, adaptability, and overall evolutionary trajectory [6]. RSV exhibits continuous evolutionary change driven by selective pressure from host immunity and antiviral interventions [7,8]. This dynamic evolution manifests genetically through nucleotide substitutions and amino acid alterations, affecting infectivity, virulence, and immune evasion [9].
Previous studies have provided valuable insights into RSV prevalence, strain circulation, and phylogenetic relationships across various seasons through genomic surveillance efforts [10,11,12]. However, phylodynamic analyses that integrate evolutionary, temporal, and demographic information to understand the transmission dynamics remain limited, especially in the context of changing epidemiological patterns such as those observed during and after the COVID-19 pandemic. Detailed investigations into RSV’s evolutionary dynamics across diverse geographic and temporal contexts are critical, as such insights inform public health interventions and guide vaccine development strategies.
In this study, we conducted a retrospective phylogenetic and phylodynamic analysis using longitudinal genomic data collected from the Johns Hopkins Health System (JHHS) over seven consecutive years, from 2018 to 2024. Our aim was to understand the genetic diversity, evolutionary patterns, and phylogenetic relationships among RSV strains circulating locally in Maryland pre, during, and post the COVID-19 pandemic.
2. Materials and Methods
2.1. Patient Consent and Study Specimens
The study received approval from the Johns Hopkins Institutional Review Board (IRB00221396, initial approval on 29 October 2019) and was granted a waiver of consent. Respiratory specimens were collected retrospectively with collection dates from 2018 to 2024. Standard-of-care RSV diagnostic testing at JHHS during this period was performed using the Cepheid Xpert Xpress SARS-CoV-2/Flu/RSV, the Xpert Xpress Flu/RSV, the NxTAG-RPP, or the cobas ePlex RP1/RP2 respiratory panels [13,14].
2.2. RNA Extraction, Real-Time RT-PCR Assay, and Two-Step Multiplex RT-PCR
RNA was extracted from 300 μL of clinical specimen and eluted in 60 μL volume using the Chemagic Viral DNA/RNA 360 Kit (Revvity, Waltham, MA, USA). Cycle threshold (Ct) values were determined via a research-use-only RT-qPCR assay targeting the RSV matrix (M) gene, utilizing the Luna^®^ Universal Probe One-Step RT-qPCR Kit (E3006) from New England Biolabs (NEB) (New England Biolabs, Ipswich, MA, USA) [15]. First-strand cDNA synthesis was carried out using the LunaScript RT SuperMix Kit (NEB) with 8 μL of extracted RNA in a total reaction volume of 10 μL under the following conditions: 25 °C for 2 min, 55 °C for 10 min, and 95 °C for 1 min. Specimens with Ct ≤ 35 were selected for downstream sequencing. Amplification of cDNA was performed in two multiplex PCR reactions (Pool 1 and Pool 2). Each PCR reaction (25 μL) contained 5 μL of cDNA, 12.5 μL Q5 Hot Start High-Fidelity 2× Master Mix (NEB), 3.5 μL nuclease-free water, and multiplex primers at final concentrations of either 200 nM or 400 nM (achieved by adding 0.5 μL or 1.0 μL of each 10 μM primer stock, respectively). PCR conditions included initial denaturation at 98 °C for 30 s, followed by 40 cycles at 98 °C for 10 s, 50 °C for 30 s, 72 °C for 4 min, and a final extension at 72 °C for 10 min. Primer sequences are listed in Supplementary Table S1.
2.3. Library Preparation, Genome Assembly, and Phylogenetic Trees Construction
Sequencing libraries were prepared using the NEBNext ARTIC Library Prep Kit and the Oxford Nanopore Native Barcoding Kit 96 (V14). Libraries were sequenced using an Oxford Nanopore GridION instrument (Oxford Nanopore Technologies, Oxford, UK). Sequence data were processed with an established in-house bioinformatics pipeline, as previously described [16]. Closest reference genomes were identified by BLAST (2.17.0) (MH447951 for RSV-A and OP975389 for RSV-B), and draft genomes were assembled with the mini_assemble module in Pomoxis. Consensus polishing was performed with Medaka, and sequencing depth was assessed using samtools. Quality control filters were applied to retain sequences with mean quality scores ≥ 30, and low-quality or incomplete genomes were manually excluded after sequencing. RSV sequences were assigned to clades using Nextclade (v3.19.0), and multiple sequence alignment was conducted using MAFFT v7 [17,18]. Phylogenetic trees were constructed employing maximum likelihood methods implemented in IQ-TREE v2.4.0, using 1000 bootstrap replicates. ModelFinder in IQ-TREE identified GTR + F + I + R3 as the optimal nucleotide substitution model [19]. Temporal signals were assessed using a root-to-tip regression in Tempest v1.5.3 with the best-fitting root [20]. Tree visualization was performed with FigTree v1.4.4. Additional reference sequences were used only in phylogenetic analyses (Supplementary Table S2) but were excluded from Bayesian analyses.
2.4. Estimated Evolutionary Rates, Bayesian Skyline Plot, and MCMC Trees Reconstruction
RSV evolutionary dynamics were investigated using Bayesian Markov Chain Monte Carlo (MCMC) analyses in BEAST v2.7.7 [21]. Of the four clock models tested (strict, uncorrelated exponential, uncorrelated lognormal, and random local), the uncorrelated lognormal relaxed clock provided the best fit with the highest log marginal likelihood (Supplementary Table S3). Bayesian Skyline Plot (BSP) analyses utilized the GTR + Γ substitution model, an uncorrelated lognormal relaxed molecular clock, and a Bayesian Skyline coalescent prior [22] with a temporal bin of 5. Convergence and performance of MCMC chains were monitored with Tracer v1.7.2, ensuring effective sample sizes (ESSs) exceeded 200 for all parameters. MCMC simulations were executed as six independent runs of 50 million generations each (totaling 300 million generations), sampled every 2000 generations (25,000 states per chain), discarding the initial 20% as burn-in prior to combining logs and trees using LogCombiner. Parameter uncertainty was summarized by 95% of the highest posterior density (HPD) intervals. Maximum clade credibility (MCC) trees were generated using Tree Annotator v2.7.7 and visualized with FigTree v1.4.4.
2.5. Positive Selection Analysis
To identify selective pressures on RSV genomes, nonsynonymous (dN) and synonymous (dS) substitution rates per site were calculated using the Nei–Gojobori method with Jukes–Cantor correction, as implemented in MEGA v12 [23]. Genes displaying an overall dN/dS ratio indicative of positive selection (dN/dS ratio > 1) underwent further site-specific analysis using the Datamonkey web server (http://www.datamonkey.org/, accessed on 15 July 2025) [24]. This gene-level threshold was applied as a conservative prioritization step to focus codon-based testing. Four complementary methods—Single-Likelihood Ancestor Counting (SLAC), Fixed Effects Likelihood (FEL), Mixed Effects Model of Evolution (MEME), and Fast, Unconstrained Bayesian AppRoximation (FUBAR)—were employed to detect codon-specific positive selection [25,26,27]. Codon sites meeting significance thresholds (SLAC, FEL, MEME: p < 0.05; FUBAR: posterior probability > 0.95) were considered positively selected.
2.6. Average Shannon Entropy Calculation
To assess overall evolutionary patterns, we quantified genome-wide nucleotide and amino acid diversity by calculating site-specific average Shannon entropy values using the Entropy-One Tool (https://www.hiv.lanl.gov/content/sequence/ENTROPY/entropy_one.html, access on 18 July 2025). To evaluate subgroup differences, we compared site-level entropy distributions between RSV-A and RSV-B for each gene using Mann–Whitney U tests. To account for multiple comparisons across genes, p-values were adjusted with the Benjamini–Hochberg procedure, and results are reported as q-values. Genes with q < 0.05 were considered statistically significant. We also calculated 95% confidence intervals for mean entropy (Supplementary Table S4).
3. Results
3.1. Patient Demographics and Sample Distribution
The sample distribution was not uniform throughout 2018 to 2024, and RSV samples were not available in 2021 due to low prevalence. Across the study period, all available RSV-positive samples with accompanying metadata were collected, resulting in 1173 unique patient samples with an approximately 1:1 male-to-female ratio (Table 1). Most samples were from infants (333/1173, 28.4%), children 1–5 years (412/1173, 35.1%), and older adults ≥ 60 years (186/1173, 15.9%). A total of 576 (49.1%) patients had one or more comorbidities. The most prevalent comorbidities were cancer (465/1173, 39.6%), non-asthmatic chronic lung disease (377/1173, 32.1%), and immunosuppression (309/1173, 26.3%). Healthcare utilization varied by year, with emergency department visits and ICU-level care being consistently high. Hospital admissions and supplemental oxygen use declined noticeably after 2022. However, we could not directly quantify the contribution of changing testing frequency to the observed temporal trends in clinical severity.
For each respiratory season, we randomly sampled without replacement around 95 RSV-positive samples from the pool of eligible samples for sequencing. This number was chosen because it corresponds to exactly one sequencing run and, in some years, the total number of specimens archived was limited. Another consideration was to avoid overrepresenting any single year, which could artificially skew estimates of yearly genetic diversity in downstream analyses. Throughout the study timeframe, a total of 451 RSV genomes out of 570 genomes were successfully recovered, comprising 310 RSV-A (68.7%) and 141 RSV-B (31.3%) cases (Figure 1). Between 2018 and 2024, RSV-A and RSV-B exhibited fluctuating predominance. RSV-B dominated in 2018 (40 vs. 16 RSV-A) and 2023 (57 vs. 19 RSV-A), whereas RSV-A was the major subgroup in 2019 (74 vs. 5 RSV-B), 2020 (54 vs. 9 RSV-B), 2022 (52 vs. 13 RSV-B), and 2024 (95 vs. 17 RSV-B). The largest seasonal cohort was in 2024 (n = 112), whereas 2018 had the fewest samples (n = 56), largely due to inadequate RNA quality and low viral loads that limited sequencing success. Our specimens were derived from routine clinical testing within JHHS, and samples selected for sequencing as well as recovering high-quality genomes reflect changes in the JHHS respiratory testing volumes and RSV positivity trends [16,28,29].
3.2. Phylogenetic Analysis of RSV
Phylogenetic analysis revealed considerable genetic diversity in RSV, identifying a total of 15 RSV-A clades and 5 RSV-B clades circulating in Maryland from 2018 to 2024 (Supplementary Table S5). RSV clade circulation patterns demonstrated notable yearly fluctuations (Table 2). RSV-B predominated in 2018 (40/56, 71.4%), primarily driven by clade B.D.4.1.1 (32/40, 80.0%), while RSV-A cases were less frequent but exhibited greater clade diversity, led by A.D (10/16, 62.5%). RSV-A became predominant in subsequent years, particularly in 2019 (74/79, 93.6%) and 2020 (54/63, 85.7%), driven largely by the clade A.D.1 (61/74, 82.4% in 2019; 43/54, 79.6% in 2020), whereas RSV-B detections drastically declined and consisted solely of B.D.4.1.1. By 2022, RSV-A continued as the primary subgroup (52/65, 80.0%) but experienced clade diversification, with A.D.5.2 emerging as the predominant clade (24/52, 46.2%), and RSV-B was dominated by B.D.E.1 (12/13, 92.3%). RSV-B further expanded to dominance in 2023 (57/76, 75.0%), largely comprising clade B.D.E.1 (53/57, 93.0%), while RSV-A decreased in prevalence, primarily represented by A.D.5.2 (7/19, 36.8%). In 2024, RSV-A again became highly dominant (95/112, 84.8%), driven notably by clade A.D.1.6 (62/95, 65.3%), whereas RSV-B circulation was limited and predominantly included clade B.D.E.1 (16/17, 94.1%). All RSV-A sequences obtained during the study period were classified as the ON1 genotype (also referred to as GA2.3.5 in the G-gene clade nomenclature; Figure 2A), whereas all RSV-B sequences were assigned to the BA9 genotype (also referred to as GB5.0.5a; Figure 2B).
3.3. Shannon Entropy, Estimated Evolutionary Rate, and Bayesian Dated Phylogenetic Tree
At the nucleotide level (Figure 3A), the G gene exhibited the highest average entropy for both subgroups (0.045 in RSV-A; 0.035 in RSV-B). The SH (0.026) and M2-2 (0.039) genes of RSV-A also showed elevated average nucleotide entropy relative to other genes such as L (0.010). At the amino acid level (Figure 3B), the G protein again displayed the greatest variability, with an average entropy of 0.078 in RSV-A and 0.060 in RSV-B. Beyond G, only the M2-2 protein (0.037) demonstrated moderate amino acid diversity in RSV-B.
Between subgroup comparisons of site-level entropy distributions within each gene showed that RSV-A had significantly greater nucleotide diversity than RSV-B in F, G, M, M2-1, M2-2, N, NS1, and P (q < 0.05), whereas RSV-B showed higher nucleotide entropy than RSV-A in L (q < 0.05). At the amino acid level, RSV-A was significantly more diverse than RSV-B in G, L, M2-1, N, and P, while RSV-B showed greater variability than RSV-A in M2-2 (q < 0.05). Although SH and F showed slightly higher average entropies in RSV-B, these differences were not statistically significant.
The mean evolutionary rate of the RSV-A dataset, estimated using an uncorrelated lognormal relaxed molecular clock, was 7.07 × 10^−4^ substitutions per site per year (95% HPD: 5.52 × 10^−4^ to 8.56 × 10^−4^). The estimated time to the most recent common ancestor (tMRCA) for all our RSV-A sequences was approximately 18.5 years before the most recent sample (95% HPD: 12.8–25.5 years) (Figure 4A). For all our RSV-B sequences, the mean evolutionary rate was slightly higher compared to our RSV-A samples, at 1.02 × 10^−3^ substitutions/site/year (95% HPD: 8.01 × 10^−4^–1.06 × 10^−3^), with a tMRCA of approximately 10.9 years (95% HPD: 9.78–12.16 years) (Figure 4B). Since all RSV-A genomes recovered in this study belonged to the ON1-derived genotype and all RSV-B genomes belonged to BA9, these tMRCA estimates reflect the most recent common ancestors of the sampled ON1 and BA9 viruses in our dataset, rather than the original emergence times of ON1 or BA9 globally.
3.4. Population Dynamic Analysis
To explore changes in RSV genetic diversity over time, we performed BSP reconstruction using a coalescent-based model implemented in BEAST2. The analysis revealed substantial temporal fluctuations in the effective population size (Ne) between 2018 and 2024 (Figure 5A). Ne was broadly stable across the study period with a single pronounced expansion peak in each subgroup, and we interpret these results as changes in genetic diversity. A sharp decline in genetic diversity of RSV-A was observed between 2018 and 2019, followed by a period of relative stability from 2019 to 2021. A notable expansion of RSV-A occurred in early 2022, peaking throughout the year. This was followed by a sharp decline in RSV-A diversity in 2023 and stabilization at a lower level in 2024. BSP analysis also revealed substantial temporal variation in the effective population size of RSV-B between 2018 and 2024 (Figure 5B). A marked expansion in genetic diversity of RSV-B was observed in mid-2018 to -2019, followed by a sustained period of a relatively stable population size through 2021. However, a dramatic reduction in RSV-B in effective population size occurred during the 2021–2022 period. This population bottleneck reached its lowest point in late 2021, after which the RSV-B population rebounded rapidly, returning to pre-pandemic diversity levels by 2022 and remaining stable in 2023 and 2024. Using a Bayesian Skyline model with 5 temporal bins, the inferred Ne trajectories were broadly stable over time with a single prominent expansion peak for each subgroup. We interpret these skyline constructions as coarse changes in genetic diversity rather than fine-scale temporal fluctuations. Because no RSV genomes were available from 2021, skyline estimate through 2021 is only informed by sequences sampled before and after the gap and therefore may have reduced temporal resolution within this interval.
3.5. Site-Specific Selection Analysis and Positive Selection Sites
To investigate selective pressures on the RSV genome, the average ratio of dN/dS was calculated for each gene (Table 3). Multiple genes exhibited average dN/dS ratios > 1, suggesting possible diversifying selection. In RSV-A, the M (1.016), SH (1.218), G (1.149), and F (1.066) genes showed signs of positive selection. Similarly, in RSV-B, the SH (1.116), G (1.113), F (1.151), and M2-2 (1.406) genes demonstrated evidence of positive selection. Other genes showed either negative or neutral selection.
Site-specific analyses further supported these findings. For the G protein, codons 71, 133, 154, 178, 225, 255, 273, and 314 in RSV-A, and 217, 276, and 285 in RSV-B were detected using at least one method (SLAC, FEL, FUBAR, or MEME). In RSV-A, only codon 273 in the G protein was supported by all four models, providing strong evidence of diversifying selection (Table 4). In RSV-B, codon 217 in the G gene was the only site identified as positively selected by all models. Additionally, only codon 245 in the RSV-A F gene was detected by MEME, suggesting episodic selection. No positively selected codons were identified in other proteins, including M, SH, or M2-2, by site-level analysis despite high average dN/dS ratios in some of these genes.
4. Discussion
The estimated evolutionary rate for RSV-A in our study falls within the expected range and is consistent with previously reported values, which generally fall between 6.72 × 10^−4^ and 7.69 × 10^−4^ substitutions/site/year based on a multi-country whole-genome phylodynamic analysis [30]. Despite higher average entropy at the nucleotide level and the amino acid level in RSV-A, the mean evolutionary rate for RSV-A in our study was slightly lower than that of RSV-B, suggesting that RSV-B may be evolving more rapidly within the local population. Consistent with prior studies from China, Italy, Japan, and South Korea, RSV-B displayed higher evolutionary rates than RSV-A, with reported values ranging from ~1.0 × 10^−3^ to 4.6 × 10^−3^ substitutions/site/year [31,32,33,34]. Previous Kenyan and Korean studies have also reported that RSV-B BA9 exhibited higher evolutionary rates than RSV-A ON1 locally [35,36]. However, a study from Senegal reported a higher ON1 substitution rate when compared to the BA9 genotype [37]. These differences likely reflect the use of gene-specific datasets, which often yield elevated rate estimates compared to whole-genome analyses such as ours, as well as potential regional variation in epidemic dynamics and the specific time periods analyzed. This could also be due to RSV-A exhibiting a broader quasispecies cloud and a larger variant pool; however, many of those variants may remain at a low frequency and fail to achieve selective advantage [38].
Between 2018 and 2024, RSV circulation was globally dominated by ON1-derived RSV-A and BA9-derived RSV-B, with geography-specific fluctuations among their clades but no evidence of replacement by novel genotypes [39]. Since 2020, the adoption of a finer clade classification has enabled studies to resolve clade dynamics with greater resolution [4]. Our analysis revealed patterns largely consistent with international reports but also exhibited distinct local features. RSV-B in our region was stably dominated by clade B.D.E.1 from 2022 through 2024, paralleling its rise in Minnesota, Beijing, Sicily, and Mexico, and consistent with global surveillance analyses identifying B.D.E.1 as a major clade driving RSV-B spread worldwide [39,40,41,42,43,44]. For RSV-A, clade A.D.5.2 was common during 2022–2023, in line with observations from these same regions, but by 2024 it was largely dominated locally by A.D.1.6, a clade not yet reported as dominant elsewhere. The subgroup fluctuations and clade turnover here likely reflect post-COVID changes in transmission. Non-pharmaceutical interventions can suppress circulation and reduce diversity, amplifying founder effects and re-seeding dynamics when mixing resumes, producing short-term dominance of particular clades even without clear evidence of a major adaptive advantage. We therefore interpret the observed subgroup predominance and clade turnover as results of changing population susceptibility and contact patterns and stochastic re-seeding after bottlenecks, rather than as evidence for a selectively drive genotype replacement.
The older tMRCA estimated for RSV-A compared to RSV-B in our dataset likely reflects distinct evolutionary histories of their dominant circulating clades, ON1 and BA9, respectively. ON1, first detected in 2010, is thought to have emerged several years earlier and may have circulated cryptically before global dissemination, as exemplified in emerging pathogens such as SARS-CoV-2 [45]. Its prolonged persistence and the presence of multiple co-circulating clades likely contributed to the deeper ancestral structure observed in RSV-A. In contrast, the more recent tMRCA of RSV-B may be attributed to stronger population bottlenecks and rapid clade turnover within the BA genotype [46]. The dominance of BA9 in recent years may reflect a clade replacement event that erased older genetic diversity, resulting in a more recent common ancestor. Nonetheless, our tMRCA estimates for RSV-A and RSV-B align with other regional studies. Kenyan whole-genome analysis placed the ON1 ancestor around 2005, while U.S. genomic data from Massachusetts traced RSV-A to 2009 and RSV-B to 2016, and Washington State surveillance similarly suggested that both subgroups had been diversifying for over a decade [47,48,49].
The population dynamics we observed in Maryland closely mirror global reports of RSV disruptions and rebounds during the COVID-19 pandemic. In Australia, RSV circulation was nearly absent during the winter of 2020, followed by large out-of-season outbreaks in 2021–2022, with genomic analyses showing reduced clade diversity and the expansion of a few RSV-A clades [50]. Similar patterns were described in New Zealand, where RSV was nearly eliminated in 2020–2021 and then reintroduced after border reopening, initially with low diversity that broadened as circulation resumed [51]. In the United States, national surveillance and regional genomic studies documented off-season resurgences in 2021 and intense early epidemics in 2022, driven by multiple co-circulating clades rather than emergence of a novel strain [49,52,53]. Our findings of RSV-A diversity in 2022 and RSV-B predominance in 2023 align with these national patterns. Across Europe, delayed and shifted epidemics in 2020–2021 were followed by RSV-B predominance in 2022–2023, as observed in France, Iceland, and Italy, paralleling the subgroup shifts we detected in Maryland [54,55,56]. However, the number of genomes sequenced varied substantially by year, skyline-derived Ne estimates for sampled years are less precise and may be over-smoothed.
Although preliminary average dN/dS analyses suggested possible positive selection sites in the G, F, SH, M, and M2-2 genes, only the G and F genes consistently demonstrated robust evidence of positive selection across four detection models. Given their roles as major antigenic targets, it is not surprising that the G and F genes are under diversifying selection [57]. In contrast, internal proteins such as N and P are functionally constrained and thus evolve under purifying selection to preserve essential viral replication and assembly functions [58].
In this study, fewer positively selected sites in the G gene were identified compared to previous reports, potentially due to stringent criteria applied in our analysis (p-value < 0.05, posterior probability > 0.95) [44,59]. Despite this quantitative difference, the positively selected sites identified consistently mapped to the first and second hypervariable regions of the G gene as reported by other RSV phylodynamic studies [40,59]. Eight G codons were found to be positively selected for RSV-A, with codon 273 being the only site identified as significant by all four models. This codon has been previously recognized in multiple studies as a hotspot for mutations [60,61,62]. Notably, codon 273 corresponds to an N-glycosylation site that is completely lost in the ON1 genotype [63,64]. Glycosylation can be under significant selective pressure from factors such as immune evasion, so the loss of glycosylation at this site may expose neighboring residues to immune surveillance [65]. Glycosylation of the RSV G protein plays a key role in shielding antigenic epitopes from neutralizing antibodies; its removal likely reflects a trade-off between immune evasion and structural flexibility [66]. However, no mutational analysis was performed to validate the impact of substitutions at this position. Four of the positively selected sites (codon 133, 225, 255, and 273) in RSV-A strains were located within previously predicted ON1 B-cell epitope regions, suggesting their potential immunological relevance [67]. Some positively selected G sites (codon 71, 225, 273, 314) were consistent with previous studies, with some sites (codon 133, 154, 178, 255) being first detected [36,59,68,69]. However, some studies suggest that the high entropy and genetic diversity of the RSV G protein may not be primarily driven by immune-mediated antigenic drift but instead may arise from neutral evolutionary processes such as genetic drift following population bottlenecks [70]. One contributing factor of predicted positive selection sites is the limited immunogenicity of strain-specific epitopes in the G protein’s C-terminal region, which tend to elicit weak neutralizing antibody responses [8]. Therefore, most potent and broadly neutralizing responses target the conserved prefusion F protein, as well as the central conserved domain of G [71,72]. Among RSV-B strains, three G sites were under positive selection, with codon 217 identified as significant across all four models. Codon 217 exhibited high frequency and Shannon entropy in a study from Cameroon, further supporting its role as a variable immunogenic region [73]. Codons 217, 276, and 285 in the RSV-B G protein have been repeatedly identified under positive selection [62,69,74]. However, we did not perform experimental functional validation such as site-directed mutagenesis to directly test the impact of substitutions at these positions of interest.
In our analysis of the RSV-A F gene, only codon 245 showed evidence of episodic selection by the MEME method. Although this site was not identified by all methods, it remains noteworthy to highlight because the F protein is relatively conserved and represents a key antigenic target for vaccines and monoclonal antibodies. Codon 245 lies within the F1 subunit of the F protein but outside the six major antigenic sites (Ø, I, II, IV, V, and VI) [57]. It does not overlap with any well-characterized neutralizing epitopes [75]. This suggests that adaptive change at this site may occur in more peripheral regions of F without directly altering mapped antibody-binding sites. In line with this, recent studies of the F protein have shown that certain residues outside canonical epitopes can act as conformational switches, altering the local or global dynamics of F and thereby influencing how neutralization epitopes are displayed to the immune system [7,76]. The identification of codon 245 as a positively selected site is therefore intriguing, as it has not been highlighted in previous reports. This novel site could reflect a localized immune pressure or functional adaptation unique to the viruses in our dataset. No experimental mutagenesis was performed on codon 245, so the impact of its variation remains unverified.
This study has several limitations. Uneven sampling across years may have influenced our phylodynamic and selection inferences by over-weighting clades from more densely sampled periods and under-representing circulating diversity in sparsely sampled years. The absence of 2021 genomes create a key temporal gap during the COVID-19 transition, limiting our ability to distinguish persistence from re-introduction and potentially smoothing skyline-derived Ne trajectories across 2020–2022. Limited subgroup representation from fewer RSV-B genomes reduces precision for subgroup-specific evolutionary rate and tMRCA estimates and lowers power to detect episodic or low-frequency positively selected sites. Since vaccination status and receipt of monoclonal antibody prophylaxis were unavailable and the dataset reflects medically attended infections, we could not stratify analyses by immune exposure or perform formal clinical-genomic association testing, which limits interpretation of antigenic selection signals and clade-severity relationships. Finally, codon-based selection methods can be sensitive to alignment uncertainty and uneven temporal sampling; thus, identified sites should be viewed as hypothesis-generating without experimental validation.
Despite these limitations, our analysis captures the evolutionary trajectories of RSV-A and RSV-B circulating in Maryland over seven years, demonstrating subgroup shifts, clade replacement, and variable selective pressures. Our findings also have direct implications for RSV preparedness and prevention policy. Since subgroup fluctuation and clade turnover in Maryland occurred largely within established ON1- and BA9-derived clades rather than through replacement by a novel genotype, the greatest standardized genomic surveillance is temporally even and linked to routine diagnostics, rather than episodic sequencing only during peaks. Integrating sequencing outputs with immunization and prophylaxis data would strengthen prevention strategies because locally informed surveillance can help optimize the timing and targeting with clinical guidance.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Li Y. Wang X. Blau D.M. Caballero M.T. Feikin D.R. Gill C.J. Madhi S.A. Omer S.B. Simoes E.A.F. Campbell H. Global, regional, and national disease burden estimates of acute lower respiratory infections due to respiratory syncytial virus in children younger than 5 years in 2019: A systematic analysis Lancet 20223992047206410.1016/S 0140-6736(22)00478-035598608 PMC 7613574 · doi ↗ · pubmed ↗
- 2Kaler J. Hussain A. Patel K. Hernandez T. Ray S. Respiratory Syncytial Virus: A Comprehensive Review of Transmission, Pathophysiology, and Manifestation Cureus 202315 e 3634210.7759/cureus.3634237082497 PMC 10111061 · doi ↗ · pubmed ↗
- 3Gaymard A. Bouscambert-Duchamp M. Pichon M. Frobert E. Vallee J. Lina B. Casalegno J.-S. Morfin F. Genetic characterization of respiratory syncytial virus highlights a new BA genotype and emergence of the ON 1 genotype in Lyon, France, between 2010 and 2014 J. Clin. Virol.2018102121810.1016/j.jcv.2018.02.00429471266 · doi ↗ · pubmed ↗
- 4Goya S. Galiano M. Nauwelaers I. Trento A. Openshaw P.J. Mistchenko A.S. Zambon M. Viegas M. Toward unified molecular surveillance of RSV: A proposal for genotype definition Influenza Other Respir. Viruses 2020142742853202242610.1111/irv.12715 PMC 7182609 · doi ↗ · pubmed ↗
- 5Goya S. Ruis C. Neher R.A. Meijer A. Aziz A. Hinrichs A.S. von Gottberg A. Roemer C. Amoako D.G. Acuña D. Standardized Phylogenetic Classification of Human Respiratory Syncytial Virus below the Subgroup Level Emerg. Infect. Dis.2024301631164110.3201/eid 3008.24020939043393 PMC 11286072 · doi ↗ · pubmed ↗
- 6ŠimičićP. Židovec-Lepej S. A Glimpse on the Evolution of RNA Viruses: Implications and Lessons from SARS-Co V-2Viruses 202215110.3390/v 1501000136680042 PMC 9866536 · doi ↗ · pubmed ↗
- 7Oraby A.K. Stojic A. Elawar F. Bilawchuk L.M. Mc Clelland R.D. Erwin K. Granoski M.J. Griffiths C.D. Frederick J.D. Arutyunova E. A single amino acid mutation alters multiple neutralization epitopes in the respiratory syncytial virus fusion glycoproteinnpj Viruses 202533310.1038/s 44298-025-00119-840295799 PMC 12015481 · doi ↗ · pubmed ↗
- 8Trento A. Ábrego L. Rodriguez-Fernandez R. González-Sánchez M.I. González-Martínez F. Delfraro A. Pascale J.M. Arbiza J. Melero J.A. Conservation of G-Protein Epitopes in Respiratory Syncytial Virus (Group A) Despite Broad Genetic Diversity: Is Antibody Selection Involved in Virus Evolution?J. Virol.2015897776778510.1128/JVI.00467-1525995258 PMC 4505632 · doi ↗ · pubmed ↗
