Beyond Variant Evolution: Structurally and Functionally Conserved Regions in the 5′UTR of SARS-CoV-2 as Resilient Antiviral Targets
Andrea Masotti

TL;DR
This paper identifies a conserved region in the SARS-CoV-2 genome that could serve as a stable, mutation-resistant target for antiviral drugs.
Contribution
The study proposes using the conserved 5′UTR region of SARS-CoV-2 as a mutation-resistant target for RNA-based antiviral therapies.
Findings
The 5′UTR region of SARS-CoV-2 is highly conserved across the variants analyzed, making it a stable target for antiviral drugs.
Computational analysis identified potential miRNA binding sites in the 5′UTR region that could be inhibited to block viral replication.
Endogenous miRNAs like miR-638 and miR-3150b-3p may bind the 5′UTR and promote replication, suggesting chemically modified antisense analogs could be effective.
Abstract
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a positive-sense RNA virus, and its genome includes a highly conserved 5′ untranslated region (5′UTR). This region contains the so-called ‘leader sequence’, a crucial genomic region responsible for the viral replication and the synthesis of all subgenomic RNAs (sgRNAs). It has been demonstrated that targeting highly conserved genomic regions is essential for developing broad-spectrum antiviral therapies that resist viral mutation and evasion. Hypothesis: Given the high level of nucleotide homology between SARS-CoV and SARS-CoV-2, particularly in essential regions like the 5′UTR, the identification of a perfect sequence alignment across SARS-CoV-2 variants within this conserved region would provide a robust, mutation-resistant target for novel RNA-based drugs, such as small interfering RNAs (siRNAs) or microRNAs…
Figure 1
Figure 2- —Italian Ministry of Health
Taxonomy
Topics: MicroRNA in disease regulation · RNA and protein synthesis mechanisms · SARS-CoV-2 and COVID-19 Research
1. Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected almost 780 million people worldwide and led to more than 7 million deaths as of December 2025. Despite the success of vaccines, the continuous development of novel therapeutic strategies to complement vaccination efforts is needed, particularly considering the emergence of variants of concern (VOCs) that exhibit high mutation rates [1,2]. RNA interference (RNAi), mediated by small interfering RNAs (siRNAs) and microRNAs (miRNAs), has emerged as a promising approach that is able to silence viral genes and their expression with high precision [1,3,4]. The use of synthetic siRNAs or exogenous miRNAs represents an attractive modality for antiviral treatment due to their ability to induce post-transcriptional gene silencing, leading to the degradation or suppression of viral RNA [2,5].
The 5′ untranslated region (5′UTR) of the SARS-CoV-2 genome, which contains the crucial leader sequence, represents a particularly relevant target [6,7]. This region is crucial for viral replication and transcription, as the leader sequence is consistently incorporated into all subgenomic RNAs [6,7,8].
It has been previously reported that mutant plasmids with different 5′UTR lengths have different properties: whereas the mutant plasmid lacking the region 1–36 did not display a markedly different activity, the plasmid lacking almost all of the UTR (i.e., the 1–222 region) completely abolished the SARS-CoV promoter activity in human cells [9,10]. It has long been known that subgenomic mRNAs lacking the 5′ leader sequence are not able to replicate, and one of the possible explanations is the presence of at least four stem loops located in the 5′-end region of the coronavirus genome; these secondary structures are actively implicated in viral replication and transcription [11].
Prior research against the related SARS-CoV demonstrated that RNAi successfully inhibited viral replication by targeting the leader sequence and other regions [12,13]. This was also reported by my group in recent years [5], where we calculated that the 5′UTR exhibited approximately 88.76% similarity between SARS-CoV and SARS-CoV-2.
Many in vitro studies have shown that small interfering RNAs (siRNAs) can suppress viral replication by targeting distinct regions of the SARS-CoV genome and reducing viral messenger RNA production. Among the regions tested, siRNAs targeting the Spike protein proved most potent, though sequences directed against the leader sequence, transcription regulatory sequence (TRS), and 3′ untranslated region (3′UTR) also successfully prevented viral infection in Vero-E6 cell cultures [13]. In another work, Li and colleagues demonstrated that a leader sequence-specific siRNA similarly achieved effective viral inhibition in Vero-E6 cells [12]. While this strategy did not constitute a complete antiviral therapy—as treated animals still exhibited clinical symptoms—it nevertheless represents a viable approach for reducing viral burden and disease severity.
A few years ago, we also emphasized the highly conserved genomic region within the first 90 nucleotides that encompasses the transcription regulatory sequence (TRS) (nucleotides 40 to 85) and is identical in both SARS-CoV and the initial SARS-CoV-2 isolates analyzed, thus establishing a highly stable therapeutic target [5]. siRNAs targeting the leader sequence have already demonstrated perfect homology against the Delta variant and maintained activity against the Alpha variant, confirming the stability of this target site [7]. This observed genomic stability reinforces the 5′UTR as a robust, broad-spectrum therapeutic target, highly suitable for the design of mutation-resistant non-coding RNA-based drugs, such as siRNAs and antisense oligonucleotides (ASOs) [5,7,14].
Another viable strategy is to target the 5′UTR regions of the viral RNA where the nucleotides form characteristic secondary structure motifs called pseudoknots [15,16]. For several years, viral RNA pseudoknots have been recognized as widespread motifs with different functions in gene expression and viral genome replication [17]. One example is the frameshift stimulation element (FSE) in SARS-CoV-2, which the virus employs to express fundamental proteins such as RNA-dependent RNA polymerase (RdRp) and other essential non-structural proteins [18].
Given the many SARS-CoV-2 genomic variants that have appeared in the last few years and are continuously arising [1], the novelty of the present opinion resides in the demonstration that this specific, highly conserved TRS region that I identified not only is a valuable targetable region but also maintains perfect sequence alignment across the multiple subsequent and divergent SARS-CoV-2 variants that have appeared so far [14,19]. Therefore, it is reasonable to argue that the structural and functional constraints of specific regions of the 5′UTR will ensure conservation across forthcoming SARS-CoV-2 variants, rendering them resilient and exploitable targets for antiviral interventions.
2. Materials and Methods
2.1. Sequence Acquisition
The reference genome sequence for SARS-CoV-2, specifically the Wuhan-Hu-1 strain (NCBI accession NC_045512 or NC_045512.2), was acquired for use as the primary template. Comparative sequence analyses also utilized the SARS-CoV reference genome (NCBI NC_004718 or AY310120.1). To extend the conservation analysis to circulating viral lineages, genomic sequences for multiple SARS-CoV-2 variants, including the Alpha (B.1.1.7), Beta (B.1.351), Kappa (B.1.617.1), Delta (B.1.617.2), Epsilon (B.1.427/B.1.429), Gamma (P.1), Eta (B.1.525), Iota (B.1.526), Omicron (B.1.1.529, BA.1, BA.2, BA.3, BA.4 and BA.5), Mu (B.1.621), Zeta (P.2), Theta (P.3) and Lambda (C.37) variants, were obtained from the public repository Global Initiative on Sharing All Influenza Data (GISAID) by downloading all sequences available up to June 2025.
Some of these variants were also reported in the literature [2,7,14,19,20].
2.2. Comparative Sequence Alignment and Conservation Mapping
Multiple sequence alignment (MSA) was performed to compare the downloaded full-length genomic RNA sequences of SARS-CoV-2 variants. The Clustal Omega alignment software (1.2.4) was employed for the alignment. The analysis focused specifically on the conservation of the 5′UTR, including the crucial leader sequence and the transcription regulatory sequence (TRS) previously identified [5]. Therefore, the initial part of the 5′UTR region, ranging from 0 to 500 bases, was trimmed and aligned (Supplementary Figure S1). The alignment showed a conserved pattern starting from base 30, so the subsequent comparison was restricted to bases 30 to 110 (Supplementary Figure S2). To narrow the analysis further and focus on the leader sequence, selected sequences were centered around positions 30–100 (Figure 1).
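The conservation mapping described above can be illustrated with a minimal sketch, assuming the aligned sequences have already been exported from the alignment software as equal-length strings. The function names (`column_conservation`, `longest_conserved_run`), the gap convention, and the toy sequences are assumptions introduced here for illustration; they are not the procedure or the data used in this study.

```python
from collections import Counter


def column_conservation(aligned):
    """For each alignment column, return the fraction of sequences that
    carry the most common base. Gaps ('-') never count as a match, so a
    gapped column scores lower (an assumption of this sketch)."""
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for column in zip(*aligned):
        counts = Counter(base.upper() for base in column if base != "-")
        top = counts.most_common(1)[0][1] if counts else 0
        scores.append(top / len(aligned))
    return scores


def longest_conserved_run(scores, threshold=1.0):
    """Return the (start, end) positions, 1-based and inclusive, of the
    longest run of columns whose score is >= threshold, or None."""
    best = None
    start = None
    # A sentinel below any valid score closes a run that reaches the end.
    for i, score in enumerate(list(scores) + [-1.0]):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start + 1, i)
            start = None
    return best


# Hypothetical toy alignment, not real viral sequence.
toy_alignment = ["ACGUACGU", "ACGUACGA", "AGGUACGU"]
toy_scores = column_conservation(toy_alignment)
toy_window = longest_conserved_run(toy_scores)
```

Lowering `threshold` below 1.0 would tolerate columns in which a minority of isolates differ, which may be preferable when some downloaded sequences are of low quality.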
2.3. Secondary Structure Prediction and Pseudoknot Formation
The formation of pseudoknots in the 5′UTR genomic region was assessed by running the online tool “RNAstructure”, a web server for RNA secondary structure prediction (https://rna.urmc.rochester.edu/RNAstructureWeb/Servers/ProbKnot/ProbKnot.html accessed on 26 February 2026). The formation of secondary structures and pseudoknots is reported in Supplementary Figure S3.
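As a point of reference for reading a predicted structure, a pseudoknot is present whenever two base pairs (i, j) and (k, l) cross, that is, i < k < j < l. The sketch below only applies this definition to a list of base pairs; it is not the prediction algorithm used by the web server, and the function names and the toy pair lists are hypothetical.

```python
def crossing_pairs(pairs):
    """Return every pair of base pairs ((i, j), (k, l)) that cross,
    i.e. satisfy i < k < j < l. Positions are nucleotide indices."""
    normalized = sorted((min(p), max(p)) for p in pairs)
    crossings = []
    for a in range(len(normalized)):
        i, j = normalized[a]
        for b in range(a + 1, len(normalized)):
            k, l = normalized[b]
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings


def has_pseudoknot(pairs):
    """True if at least one pair of base pairs crosses."""
    return bool(crossing_pairs(pairs))


# Hypothetical examples: a plain hairpin (nested pairs only) and an
# H-type arrangement in which a second helix crosses the first.
hairpin = [(1, 10), (2, 9), (3, 8)]
h_type = [(1, 10), (2, 9), (6, 15), (7, 14)]
```

A pair list exported from any structure prediction tool could be checked in the same way, provided the positions are given as integer indices.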
3. Results
Figure 1 presents a sequence alignment focusing on a portion of the 5′UTR of SARS-CoV-2 variants, specifically spanning nucleotides 30 through 102. The alignment compares isolates that circulated from the onset of the pandemic to June 2025. In total, 140 sequences were analyzed, and the global mean identity was very high (98.09%), as shown in Table 1. The results indicate high homogeneity in the majority of countries (i.e., France, Ireland, Netherlands), with an internal identity greater than 99%, suggesting local circulation of very similar variants.
However, Germany and Spain displayed higher internal variability (minimum identity of 71.34% and 76.37%, respectively), suggesting the presence of multiple distinct lineages. Northern Ireland had the lowest mean identity relative to other countries (94.89%), which suggests peculiar genomic characteristics compared with the European mean.
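The mean and minimum identity values reported above are summaries of a pairwise percent identity matrix. A minimal sketch of how such summaries can be derived from aligned sequences follows. The gap convention (columns with a gap in either sequence are skipped) is an assumption and may differ from the convention of the alignment software, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity between two aligned sequences, counted over the
    columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_summary(aligned):
    """Return (mean, minimum) percent identity over all sequence pairs."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    if not values:
        raise ValueError("at least two sequences are required")
    return sum(values) / len(values), min(values)


# Hypothetical toy alignment, not real viral sequence.
toy_group = ["ACGUACGU", "ACGUACGA", "AGGUACGU"]
toy_mean, toy_min = identity_summary(toy_group)
```

Applying `identity_summary` separately to the sequences of each country would give per-country values of the kind discussed above, while applying it to all sequences together would give a global mean.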
The alignment demonstrates a region of substantial conservation between these distinct variants (Figure 2). This region is significant because it encompasses the TRS (outlined in blue) and the leader sequence (outlined in red), which are crucial for viral replication and transcription and remain largely conserved in the later-stage variants analyzed (Figure 1). A larger portion of the genomic region, together with the variants omitted here for clarity, is reported in Supplementary Figures S1 and S2; the conclusions that I reached are essentially the same. The full percent identity matrix is available as Table S1 in Supplementary Materials.
The 5′UTR region is also able to form pseudoknots (Supplementary Figure S3), which could represent regions potentially targetable by ASOs or siRNAs/miRNAs, thus further expanding the therapeutic arsenal.
4. Discussion
The ongoing nature of the COVID-19 pandemic, coupled with the frequent emergence of new VOCs that demonstrate high mutation rates, requires alternative therapeutic strategies that could complement vaccination efforts [1,2,19]. Viral mutations, particularly in highly immunogenic genomic regions like the Spike (S) protein, can reduce the efficacy of neutralizing antibodies and certain small-molecule antivirals, facilitating viral evasion [2,7,14].
RNA interference (RNAi), leveraging highly specific synthetic siRNAs, exogenous miRNAs, or antisense oligonucleotides, offers a promising approach owing to the targeted silencing and degradation of viral RNA transcripts [1,5,21]. For RNAi-based drugs to maintain efficacy across a continually evolving spectrum of viral strains, they must target sequences essential for viral fitness that are inherently resistant to mutation [14,22]. However, RNAi has drawbacks, such as inefficient delivery to respiratory tissue, potential off-target effects, activation of the innate immune response, and potential regulatory barriers [23], some of which can be addressed by using modified oligonucleotides (personal unpublished data).
The 5′UTR of the SARS-CoV-2 genome, in particular the region that contains the TRS and the leader sequence, has been established as a valuable target as it satisfies these criteria [5,6]. The leader sequence is essential for viral replication and transcription. In fact, it is incorporated into all sgRNAs required for the synthesis of structural proteins [7,8].
Our initial study reported the genetic stability of this region and showed that the 5′UTR shares approximately 88.76% similarity between SARS-CoV and SARS-CoV-2, with key segments such as the TRS being identical in the genomes considered [5]. The importance of the present study lies in validating that this high degree of conservation persists even among widely circulating and more recent variants. The alignment data presented in Figure 1 for isolates circulating from 2020 to 2025 confirm that the core sequence in the 5′UTR remains highly stable and conserved despite the emergence of the Alpha, Delta, and various Omicron subvariants. Our results emphasize the role of the 5′UTR as a potential mutation-resistant therapeutic target, with the region between nucleotides 30 and 100 as the most important. The key advantages for therapeutic development using siRNAs and miRNAs are as follows: (i) by targeting an identical sequence across all circulating variants, the resulting RNAi therapeutic is inherently ‘broad-spectrum’, offering protection not only against known variants but also potentially against future coronaviruses; (ii) targeting conserved sites outside hypervariable regions (like Spike) greatly reduces the chance of emerging viral escape mutants, thereby enhancing the long-term viability of the drug; (iii) a potent mechanism of action: the leader sequence is present on both the genomic RNA and all sgRNAs, so targeting this region simultaneously inhibits replication and the synthesis of structural proteins, leading to a profound reduction in viral replication and load; (iv) the conservation supports strategies exploiting miRNA targeting.
If host miRNAs (like miR-4507 or miR-638, which are highly expressed in lung tissue) bind the conserved 5′UTR to promote viral replication, the design of antisense molecules (such as LNA-based GapmeRs, or antago-miRs) to sequester those miRNAs could represent an indirect yet broad-spectrum antiviral strategy.
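At the sequence level, both an siRNA guide strand directed against a conserved viral region and an antisense molecule designed to sequester a host miRNA start from the reverse complement of the target. The sketch below shows only this first step under simple Watson-Crick pairing. It does not model chemical modifications (such as LNA), GapmeR architecture, thermodynamics, or off-target screening, and the function names and the toy sequence are hypothetical rather than a real miRNA or viral sequence.

```python
_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(target):
    """Return the reverse complement of an RNA target, written 5'->3'.
    DNA-style input is accepted by reading T as U."""
    seq = target.upper().replace("T", "U")
    try:
        return "".join(_COMPLEMENT[base] for base in reversed(seq))
    except KeyError as exc:
        raise ValueError(f"unexpected base: {exc.args[0]}") from None


def is_fully_complementary(oligo, target):
    """True if the oligo pairs with every base of the target."""
    return antisense_rna(target) == oligo.upper().replace("T", "U")


# Hypothetical toy target, for illustration only.
toy_target = "AUGGCUA"
toy_oligo = antisense_rna(toy_target)
```

Because the candidate target window is conserved, the same oligonucleotide sequence would in principle remain fully complementary across the variants sharing that window, which is the property this opinion argues for.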
5. Conclusions
Sustained targeting of these structurally and functionally conserved regions, combined with advances in computational design and delivery systems, positions RNAi as a leading strategy for the development of pan-coronavirus therapeutics with intrinsic resilience to viral evolution.
Finally, the evolutionary and functional constraints governing the 5′UTR provide strong grounds to argue that these conserved regions will be retained across forthcoming SARS-CoV-2 variants, ensuring the sustained relevance of the findings presented here as durable foundations for antiviral target design.
References
- 1. Aram C., Firuzpour F., Barancheshmeh M., Kamali M.J. Unveiling the translational and therapeutic potential of small interfering RNA molecules in combating SARS-CoV-2: A review. Int. J. Biol. Macromol. 2025, 318, 145203. doi: 10.1016/j.ijbiomac.2025.145203. PMID: 40513718.
- 2. Lee Y., Tsai H., Yeh C., Fang C., Chan M.W.Y., Wu T., Shen C. RNA Interference Approach Is a Good Strategy against SARS-CoV-2. Viruses 2022, 15, 100. doi: 10.3390/v15010100. PMID: 36680140. PMCID: PMC9862891.
- 3. Donia A., Bokhari H. RNA interference as a promising treatment against SARS-CoV-2. Int. Microbiol. 2021, 24, 123–124. doi: 10.1007/s10123-020-00146-w. PMID: 32875426. PMCID: PMC7462657.
- 4. Hasan M., Ashik A.I., Chowdhury M.B., Tasnim A.T., Nishat Z.S., Hossain T., Ahmed S. Computational prediction of potential siRNA and human miRNA sequences to silence orf1ab associated genes for future therapeutics against SARS-CoV-2. Inform. Med. Unlocked 2021, 24, 100569. doi: 10.1016/j.imu.2021.100569. PMID: 33846694. PMCID: PMC8028608.
- 5. Baldassarre A., Paolini A., Bruno S.P., Felli C., Tozzi A.E., Masotti A. Potential use of noncoding RNAs and innovative therapeutic strategies to target the 5′UTR of SARS-CoV-2. Epigenomics 2020, 12, 1349–1361. doi: 10.2217/epi-2020-0162. PMID: 32875809. PMCID: PMC7466951.
- 6. Pandey A.K., Verma S. An in silico analysis of effective siRNAs against COVID-19 by targeting the leader sequence of SARS-CoV-2. Adv. Cell Gene Ther. 2021, 4, e107. doi: 10.1002/acg2.107. PMID: 33786418. PMCID: PMC7995175.
- 7. Tolksdorf B., Nie C., Niemeyer D., Rohrs V., Berg J., Lauster D., Adler J.M., Haag R., Trimpert J., Kaufer B. Inhibition of SARS-CoV-2 Replication by a Small Interfering RNA Targeting the Leader Sequence. Viruses 2021, 13, 2030. doi: 10.3390/v13102030. PMID: 34696460. PMCID: PMC8539227.
- 8. Kim D., Lee J., Yang J., Kim J.W., Kim V.N., Chang H. The Architecture of SARS-CoV-2 Transcriptome. Cell 2020, 181, 914–921.e10. doi: 10.1016/j.cell.2020.04.011. PMID: 32330414. PMCID: PMC7179501.
