Designing a novel vaccine against COVID-19 based on spike SARS-Cov-2 notable mutations using immunoinformatics approaches
Somayyeh Rahimnahal, Shahnaz Yousefizadeh, Yahya Mohammadi

TL;DR
This paper presents a new vaccine design targeting SARS-CoV-2 spike mutations using immunoinformatics to ensure broad immunity against multiple variants.
Contribution
A novel computational design of multi-epitope vaccines targeting spike mutations in SARS-CoV-2 variants is proposed.
Findings
Two multi-epitope vaccines, Cov19-B and Cov19-T, were designed with high population coverage for MHC I and II epitopes.
The vaccines showed stable binding with human antibodies through molecular docking and dynamics simulations.
Immune simulations confirmed strong humoral and cellular immune responses triggered by the vaccine constructs.
Abstract
The rapid emergence of SARS-CoV-2 variants with spike protein mutations undermines the effectiveness of current vaccines, necessitating innovative strategies to ensure broad and lasting immunity. This study leverages an immunoinformatics approach to design two multi-epitope vaccine constructs Cov19-B (649 amino acids, 74 kDa) and Cov19-T (465 amino acids, 48 kDa) specifically targeting mutations in the spike protein observed in the Alpha, Beta, Gamma, and Omicron variants. Using sequence data retrieved from NCBI, GISAID, and UniProt, we predicted a range of epitopes, including linear B-cell, cytotoxic T lymphocyte (CTL), helper T lymphocyte (HTL), and IFN-gamma-inducing epitopes, selected for their high antigenicity, solubility, non-allergenicity, and non-toxicity. These epitopes provide extensive global population coverage: 76.83% for MHC I, 87.43% for MHC II, and 93.8% for combined…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Fig 1
Fig 2
Fig 3
Fig 4
Fig 5
Fig 6
Fig 7
Fig 8
Fig 9
Fig 10
Fig 11
Fig 12
Fig 13
Fig 14
Fig 15
Fig 16Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
Topicsvaccines and immunoinformatics approaches · SARS-CoV-2 and COVID-19 Research · Diverse Scientific Research Studies
Introduction
The COVID-19 pandemic, originating in Wuhan, China, in December 2019, caused severe symptoms including fever, coughing, and dyspnea [1,2]. The COVID-19 pandemic, originating in Wuhan, China, in December 2019, caused severe symptoms including fever, coughing, and dyspnea. Identified as a novel β-coronavirus (SARS-CoV-2) with sequence similarity to bat coronaviruses and MERS-CoV, it prompted global research into its characteristics, transmission, and therapies [3–6]. Its rapid spread highlighted the need for international action. Coronaviruses, including SARS (2002), MERS (2013), and COVID-19, belong to the Coronaviridae family, with SARS-CoV-2 possessing a large genomic RNA (26–32 kb) [4,6–11]. The spike glycoprotein (S-protein) binds to the host’s ACE2 receptor, primarily in lung alveolar cells, facilitating viral entry, replication, and transmission [12–15]. This process involves the release of the viral genome into the host cell, producing structural (spike, envelope, membrane, nucleocapsid) and nonstructural proteins (Replicase 1ab), enhancing viral infectivity [13,16].
The SARS-CoV-2 Spike (S) protein, comprising S1 and S2 subunits, is a key target for vaccines and therapeutics. The S1 subunit’s receptor-binding domain (RBD) facilitates viral entry by binding host cell receptors, while the S2 subunit mediates membrane fusion. Targeting the S protein with neutralizing antibodies or antiviral drugs, particularly against the S2 subunit, can prevent viral entry and reduce infection severity [17–19]. Monoclonal antibodies, such as CR3022, and plasma exchange from infected patients show therapeutic promise, though further studies are needed to confirm efficacy and safety [20]. Various COVID-19 vaccine candidates, including subunit, DNA, mRNA, and live attenuated or inactivated vaccines, aim to enhance safety, efficacy, and scalability to curb the pandemic.Inactivated SARS-CoV-2 vaccines, despite safety concerns like insufficient inactivation and vaccine-associated respiratory disease, have proven effective in clinical trials and are tightly regulated in approved countries [17]. Further research is needed to enhance their safety and efficacy. DNA vaccines, effective against MERS-CoV, use plasmids to produce S protein, activating both humoral and cellular immunity [21,22].
Subunit vaccines targeting recombinant S protein have been successful for SARS and MERS. However, DNA and mRNA vaccines may have limited protective effects due to dysregulated immune responses [23–26].
Multi-epitope vaccines, incorporating diverse antigenic components, offer a safer, more flexible solution, targeting multiple viral strains and generating robust immunity. [27–30]. Therefore, it becomes crucial to anticipate how immunogenic epitopes would interact with MHC. By leveraging the antigenic and immunogenic features of the Spike (S) protein, this methodology aims to maximize the vaccine’s efficacy and specificity while minimizing potential side effects. Therefore, the most important antigenic candidate for fighting SARS-CoV-2 is the glycoprotein S. Antiviral medications have been utilized as therapeutic approaches to reduce the viral burden or inhibit the spread of SARS-CoV-2 [31].
A novel betacoronavirus has been identified as the causative agent of severe acute respiratory syndrome (SARS) as reported by the Chinese Center for Disease Control (China CDC) on 19th December 2019 [1], which has since infected or killed a large number of people. Although multiple vaccines, such as Pfizer BNT162b2, AstraZeneca‘s AZD1222, Moderna’s mRNA-1273, and Janssen‘s Ad26.COV2.S were produced to combat COVID-19 during this period [32–35], highlight the need for broader-spectrum vaccines, including those targeting emerging variants [36]. In this way, it is ready to deal with the latest SARS mutants worldwide.
The ongoing SARS-CoV-2 pandemic, driven by variants like Delta and Omicron, underscores the need for vaccines that provide broad-spectrum protection [37]. The immunopathogenesis of COVID-19 is characterized by dysregulated innate and adaptive immune responses, initiated by Toll-like receptors (TLRs) such as TLR3, TLR4, and TLR7, which recognize viral components and trigger cytokine production (IL-6, TNF-α) [38]. In severe cases, TLR-driven hyperactivation leads to cytokine storms, marked by excessive IL-6 and IFN-γ, contributing to acute respiratory distress and multi-organ failure [39,40]. Adaptive immunity, involving cytotoxic T-lymphocytes (CTLs) and B-cells, is critical for viral clearance but often impaired in severe disease [41]. Current therapeutic measures, including antivirals (remdesivir), monoclonal antibodies (bamlanivimab), and immunomodulators (dexamethasone), offer partial efficacy but are limited against emerging variants [42]. Multi-epitope vaccines targeting conserved epitopes across SARS-CoV-2 proteins can address these challenges by eliciting robust humoral and cellular immunity [37]. Immunoinformatics approaches, using tools like IEDB, NetCTL, and ABCpred, enable precise epitope selection and construct design, as demonstrated in vaccines for SARS-CoV-2 and other pathogens [43,44].
The rapid spread of SARS-CoV-2 and its variants (Delta, Omicron) underscores the need for broadly protective vaccines targeting conserved epitopes to ensure durable immunity [45]. Multi-epitope vaccines, combining cytotoxic T-lymphocyte (CTL), helper T-lymphocyte (HTL), and B-cell epitopes, have emerged as a promising strategy to elicit robust humoral and cellular responses against pathogens like SARS-CoV-2, Yersinia pestis, Pseudomonas aeruginosa, and Mycobacterium tuberculosis [44,46,47]. Immunoinformatics approaches, leveraging tools such as IEDB, NetCTL, Propred, and BepiPred, enable precise epitope selection and construct design, as demonstrated in vaccines for filariasis and other infectious diseases [43]. Recent studies have also incorporated adjuvants like flagellin, CpG, and ribosomal proteins, alongside linkers (AAY, GPGPG) and disulfide engineering, to enhance immunogenicity and stability [44,47].
The rapid evolution of SARS-CoV-2, driven by mutations in its genome, particularly in the spike (S) protein, has led to the emergence of variants such as Delta and Omicron, which exhibit enhanced transmissibility and immune evasion, compromising the efficacy of existing vaccines [1,2]. These mutations, often occurring in the receptor-binding domain (RBD), alter antigenicity, reducing neutralizing antibody effectiveness and necessitating adaptive vaccine strategies [3]. Identifying and characterizing these mutations is critical for developing multi-epitope vaccines that target conserved epitopes across multiple viral proteins, ensuring broad-spectrum protection against diverse strains [4]. In this study, we designed two multi-epitope vaccine constructs, Cov19-B (B-cell-focused) and Cov19-T (T-cell-focused), using β-defensin and PADRE adjuvants, EAAAK and GPGPG linkers, and disulfide engineering to target SARS-CoV-2 variants, building on methodologies validated across multiple pathogens [48].
The availability of massive sequence data from pathogens, as well as the improvement of computational prediction methods, has substantially aided the identification of possible immunogenic epitopes in pathogen proteins in the postgenomic era [49]. This can help with the development of vaccine against certain infections. Recent studies have predicted viable vaccine candidates for viral infections using immunoinformatics approaches [50].
Materials and methods
All steps related to this study are shown in Fig 1.
All performed steps of this study.
Mutations selection
Genomic data from approximately 12 million SARS-CoV-2 sequences, deposited in the GISAID database (https://www.epicov.org) from the start of the pandemic to January 8, 2022, were retrieved. A custom script was used to extract each strain’s sequence from FASTA format files. For each viral strain, nucleotide and amino acid variations were identified through a multi-step process. The SARS-CoV-2 reference genome was indexed using bwa-build, and viral sequences were mapped to this index with bwa-mem [51]. The resulting SAM files were converted to BAM files using Samtools view v1.9, then sorted and indexed with Samtools sort and Samtools index v1.9, respectively, applying default settings. Variant calling was conducted using bcftools call v1.9 [52]. An in-house script was employed to calculate the frequencies of nucleotide and amino acid mutations within each gene.
Linear and discontinuous B-Cell epitope prediction
Linear B-cell epitopes were predicted using the IEDB server (http://tools.iedb.org/bcell/). This server employs an approach that evaluates key polypeptide chain characteristics to determine antigenicity, utilizing amino acid scales and Hidden Markov Models (HMMs) to analyze sequence properties and identify linear B-cell epitopes.
Discontinuous B-cell epitopes were identified using the ElliPro server (http://tools.iedb.org/ellipro/). ElliPro predicts these epitopes based on the three-dimensional structure of a protein antigen, calculating a score derived from the average Protrusion Index (PI) of epitope residues. It uses the PI to detect discontinuous B-cell epitopes and clusters them based on the distance (in angstroms) between the residues’ centers of mass.
CLT epitope prediction
The first step in designing multi-epitope vaccines using bioinformatics is to identify the immunoprotective properties of epitopes. T-cell epitopes, presented by MHC molecules, are linear and typically comprise 9 to –20 amino acid residues, enabling accurate modeling of ligand–T-cell interactions. The binding of MHC molecules to antigenic peptides is a crucial step in presenting them to T-cell receptors. In this study, we employed Artificial Neural Networks to predict T-cell epitopes with high MHC class I binding affinity, utilizing the NetMHCpan4.1 server (http://www.cbs.dtu.dk/services/NetMHCpan/). This tool predicts MHC class I binding with an accuracy of 90–95%.
HLT epitope prediction
To predict interactions between helper T-cell (HTL) epitope peptides and MHC class II molecules, we utilized the NetMHCIIpan 4.0 server (http://www.cbs.dtu.dk/services/NetMHCIIpan/). Thresholds of 2% and 10% were set to identify strong and weak binders, respectively, based on human alleles. This method was designed to assess the binding affinity of T-cell epitope peptides to MHC class II molecules.
Investigation of coverage in population
The IEDB population coverage server (https://tools.iedb.org/population/) was used to evaluate whether the selected vaccine epitopes provided effective coverage across the global population. Considering the global scope of the SARS-CoV-2 pandemic, the analysis included diverse regions worldwide. The assessment was performed using default settings, testing coverage for both HLA classes. This comprehensive approach sought to ensure inclusivity across varied populations.
Epitopes conservation analysis
The IEDB conservancy analysis tool was utilized, so that epitopes could be assessed for conservation at different lengths. This analysis aimed to provide insights into how well the selected epitopes were preserved across related sequences, contributing to a better understanding of their potential effectiveness in diverse contexts.
Physicochemical properties of the vaccine construct
The antigenicity of the vaccine design was confirmed using VaxiJen v2.0109 with a threshold value of 0.4. Whole-protein antigenicity prediction models were derived from viral databases. The server’s models demonstrated robust performance in both validations, achieving prediction accuracies within the range of 70% to 89%. To assess the vaccine’s allergenicity, the AllerTOP server was employed. The vaccine’s allergenicity was further verified using the AllergenFP server (https://ddg-pharmfac.net/AllergenFP/). The ExPASy ProtParam server was utilized to evaluate various physicochemical parameters of the vaccine. Additionally, SignalP4.1 (https://www.cbs.dtu.dk/services/SignalP/) and TMHMM server v2.0 (https://www.cbs.dtu.dk/services/TMHMM/) were employed to check for the presence of signal peptides and transmembrane helices in the vaccine design. These comprehensive analyses contribute to a thorough understanding of the vaccine’s properties and potential efficacy.
In-silico cloning and vaccine codon optimization
To construct effective vaccine components, antigenic epitopes were fused using specific peptide linkers. Three distinct constructions were created for each linear B lymphocyte (LBL), cytotoxic T lymphocyte (CTL), and helper T lymphocyte (HTL). For optimizing codons, the GenSmart™ Codon Optimization available at https://www.genscript.com/tools/gensmart-codon-optimization was employed. The Codon Adaptation for estimating protein expression, were set at 1.0 and 30–70%, respectively. The goal of codon optimization was to maximize expression in strain K-12 Escherichia coli. The rho-independent transcription terminators and bacterial ribosome binding sites were all avoided by setting the parameters. To identify NdeI and XhoI restriction enzyme sites, the SnapGene tool was used.
Homology modeling, and 3D structure validation
The 3D structure of the construct was modeled using I-TASSER (Iterative Threading ASSEmbly Refinement), as developed by I-TASSER.
To improve the accuracy and quality of the 3D model, refinement was conducted using the ModRefiner tool (https://zhanglab.ccmb.med.umich.edu/ModRefiner). This process enhances the precision and reliability of the 3D structure predictions, ensuring an accurate representation of the vaccine construct’s spatial configuration. The stereochemical quality of the protein structure was evaluated using PROCHECK (http://services.mbi.ucla.edu/PROCHECK), which analyzes parameters such as bond lengths, bond angles, and dihedral angles to assess the model’s quality. The SAVES server (http://services.mbi.ucla.edu/SAVES) was employed to calculate the Z-Score for the 3D structures of Cov19-B and Cov19-T. SAVES integrates multiple validation programs to evaluate stereochemical properties, side-chain parameters, and other structural features, ensuring the reliability of the modeled structures. The Ramachandran plot, generated via the MolProbity server (http://molprobity.biochem.duke.edu/), was used to visualize the distribution of phi and psi dihedral angles, facilitating the identification of potential issues in the backbone conformation of the modeled structures.
Disulfide engineering of Cov19-B and Cov19-T
To enhance the structural stability of the vaccine constructs, disulfide bond engineering was performed using the Disulfide by Design 2.0 (DbD2) server. The refined 3D models of Cov19-B and Cov19-T were analyzed to identify residue pairs suitable for disulfide bond formation. Criteria for selection included: (i) spatial proximity of residue pairs (Cα-Cα distance ~4–6 Å), (ii) favorable bond geometry (χ3 torsion angle ~±90°), and (iii) minimal disruption to epitope antigenicity. Non-cysteine residues meeting these criteria were mutated to cysteines to introduce disulfide bonds. The number of disulfide bonds introduced was optimized to balance stability and flexibility, ensuring no interference with MHC binding sites.
Docking of vaccine construct with antibody
The 3D structure of the human-neutralizing antibody (PDB ID: 7BWJ) was obtained from the RCSB server. The 3D structure of 7BWJ was optimized using PyMOL v2.3.4, and hydrogen peroxide molecules were removed from the PDB structures of the vaccine constructs and 7BWJ. To identify interaction zones, the prepared 3D structures of Cov19-B, Cov19-T, and 7BWJ were submitted to the HDOCK server after preprocessing with Chimera. Complexes exhibiting the lowest mean Root Mean Square Deviation (RMSD) and intermolecular binding energy between the vaccine constructs and the heavy and light chains of 7BWJ were selected based on HADDOCK results. LigPlot+ was utilized to evaluate hydrogen bond formation, providing insights into the hydrogen bond interactions within the complexes. This rigorous approach aimed to elucidate potential binding and interaction patterns between the 7BWJ antibody and the Cov19-B/T constructs.
Simulation of complexes by MD
Molecular dynamics simulations of the vaccine-antibody complexes were performed using GROMACS v4.6.5 with the CHARMM36 all-atom force field. The protein–protein complexes were solvated with water molecules, and the system was neutralized. Energy minimization was conducted to stabilize the system’s configuration. Equilibration was carried out under NVT (constant Number of particles, Volume, and Temperature) and NPT (constant Number of particles, Pressure, and Temperature) conditions at 300 K, 1 bar, and restraint forces of 1000 kJ/mol. Production simulations were run for 100,000 ps (100 ns). All bonds were constrained using the LINCS algorithm to maintain stability and prevent unrealistic bond length variations. These procedures collectively enabled the simulation of the dynamic behavior of the vaccine-antibody complexes over an extended period. The use of GROMACS and the CHARMM36 force field provided a robust framework for precise and detailed molecular dynamics simulations.
Estimation of binding-free energies
The MMPBSA method calculates binding free energies in molecular dynamics simulations, offering critical insights into the energetics of protein–protein or protein–ligand interactions. The MMPBSApy module from AMBER16 was utilized to compute Molecular Mechanics/Poisson-Boltzmann Surface Area (MMPBSA) binding free energies for the receptors and the multi-epitope peptide vaccine construct. Input files for the complex, receptor, and vaccine construct were generated using the ante-MMPBSA.py module. From the simulation trajectories, 100 frames were extracted to analyze the energy differences between solvated and unsolvated states. Binding free energies were accurately determined by comparing two distinct conformations and assessing the contributions of key residues to the binding energetics. To calculate the free binding energy (ΔGbind, solv) of the anticipated complex, three equations (Eqn 1, Eqn 2, Eqn 3) were utilized. These equations involve terms related to the solvation-free energy, van der Waals interactions, electrostatic interactions, and entropy contributions [53,54]:
The net free binding energy was broken down into each residue in order to emphasize the stable and interacting residues.
Immune simulation
The immunogenicity and immune response profile induced by the Cov vaccine were evaluated using the C-ImmSim server, which employs machine learning techniques to predict immune responses. A minimum interval of four weeks was set between the first and second doses. The simulation spanned 1050-time steps, with each time step corresponding to approximately eight hours. Three injections were administered at four-week intervals, corresponding to time steps 1, 84, and 168.
Ethics statement
This study does not contain any human/animal samples. All authors agree on the Ethical Approval and Consent to participate.
Results
Sequence retrieve
The spike sequence of SARS-Cov-2, with protein ID of P0DTC2, was selected from UniProt based on previous studies.
Selection of mutation
All mutations with frequencies higher than 1% were extracted from the spike sequence and used for further study (Table 1). A multimer/tetramer structure of the Spike (RCSB code: 6VXX) is shown in Fig 2A. A plot of the frequency of mutations across the spike sequence is shown in Fig 2B.
Table 1: Mutations and frequency related to them.
(A) The multimeric (tetramer) structure of the SARS-CoV-2 Spike protein, derived from the PDB entry 6VXX (RCSB Protein Data Bank), visualized using PyMOL software.The structure highlights the trimeric arrangement of the Spike glycoprotein, with distinct domains including the receptor-binding domain (RBD) and S2 subunit, critical for viral entry and immune recognition. Different chains are color-coded to illustrate the quaternary organization, providing a foundation for identifying conserved regions targeted in the design of the multi-epitope vaccine constructs [55] and visualized by PyMol. (B) Scatter plot depicting the frequency of mutations across the Spike protein sequence (residues 1 to 1400), based on genomic data analysis of SARS-CoV-2 variants. The x-axis represents residue positions, while the y-axis indicates mutation frequency (in percentage). High mutation frequencies, particularly in the RBD (residues ~330–530), underscore the virus’s immune evasion potential, guiding the selection of conserved epitopes for vaccine design to ensure broad-spectrum protection against evolving variants.
Prediction of B-cell epitopes: Linear
Twenty-six linear B-cell epitopes were chosen for their mutations, aiming to stimulate humoral immunity through the vaccine. Predicted B-cell epitopes, and subsequent evaluation of antigenicity was carried out (Table 2). Also, all mutations related to each epitope are shown in Table 2. Mutations were applied to the final construct.
Table 2: B-cell epitope selected alongside their mutation. Overlapped sequence/epitope wit CTL epitopes and HTL epitopes are shown as underlined.
Prediction of the cytotoxic T lymphocyte epitope
Given the importance of fostering cellular immunity, an evaluation of the S protein was carried out. As shown in Table 3, the 73 CTL epitopes were chosen for their antigenicity and binding affinity. All mutations related to each epitope are shown in Table 3. Mutations were incorporated into the final construct.
Table 3: CTL epitope selected alongside their mutation. Overlapped sequence/epitope with B-cell epitopes and HTL epitopes are shown as underlined. Repetitive CTL epitopes are shown in bold characters.
Selection of helper T Lymphocyte Epitopes
40 HTL epitopes were investigated with an emphasis on finding strong-binding and highly antigenic epitopes (Table 4). Furthermore, each selected HTL epitope’s ability to induce IFN-γ is shown in Table 4.
Table 4: HTL epitope selected alongside their mutation. Overlapped sequence/epitope with B-cell epitopes and CTL epitopes are shown as underlined. Repetitive HTL epitopes are shown in bold characters.
Calculating the conservation of epitopes
The S protein showed a noticeably high percentage of conservancy based on the epitope conservancy study for the B-cell, CTL, and HTL selected epitopes. Specifically, conservancy rates for B-cell, CTL, and HTL epitopes ranged from 99% to 100%, 85% to 100%, and 81% to 100%, respectively. [Tables 5–7](#pone.0334662.t005 pone.0334662.t006 pone.0334662.t007) present comprehensive results regarding epitope conservation. Among the four proteins, the S protein showed the highest epitope conservation while the E protein had the lowest. These results suggest that the chosen epitopes are highly conserved.
Table 5: Conservation of B-cell epitopes.
Table 6: Conservation of CTL epitopes.
Table 7: Conservation of HTL epitopes.
Population coverage
For every selected epitope, population coverage was evaluated based on its frequency among different ethnic groups. MHC I, MHC II, and the combined MHC I–MHC II class alleles were considered in the analysis. With an average of 93.8%, the combined class alleles showed the highest population coverage for epitopes overall (S1 Table). As shown in Fig 3A, B and C, it can be seen that the chosen epitopes were present in the populations in 64 different regions.
Comprehensive population coverage analysis of selected epitopes across different geographical regions.The bar charts illustrate the predicted population coverage of selected MHC Class I (red), Class II (blue), and combined (cyan) epitopes across various countries and regions based on HLA allele frequencies. This analysis was performed using the IEDB population coverage tool to evaluate the potential global applicability and effectiveness of the designed multi-epitope vaccine constructs (Cov19-B and Cov19-T). The high levels of combined population coverage observed in most regions (>80%) indicate broad immunogenic potential and support the suitability of the vaccine candidates in diverse genetic backgrounds.
Prediction of B-cell epitopes: Conformational
Additionally, conformational B-cell epitopes from the final structures of Cov19B and Cov19T were analyzed, focusing on identifying any overlap between different types of B-cell epitopes (S2 and S3 Tables)
Construction analyzation
For the first construct (Cov19B), 24 B-cell epitopes were fused using KK linkers. Human Beta-defensin 3 (HBD3) and PADRE sequences were linked with an EAAAK linker. For the second construct (Cov19T), 10 CTL and 8 HTL epitopes were linked via AAY and GPGPG linkers, respectively. The 50S ribosomal protein L7/L12 from Mycobacterium tuberculosis was fused as an adjuvant to the C-terminal of Cov19T via EAAAK linker (Fig 4A & 4B).
Schematic representation of vaccine constructs Cov19B and Cov19T, including linker arrangements, adjuvants, and in silico cloning.(A) Amino acid sequence of the Cov19B vaccine construct showing the arrangement of selected B-cell, CTL, and HTL epitopes linked with appropriate peptide linkers (highlighted in yellow: KK, pink: EAAAK, gray: epitope regions). The sequence includes the PADRE sequence and the Human Beta-defensin 3 (HBD3) as adjuvants. (B) Cov19T construct sequence featuring similar epitope-linker architecture. The CTL and HTL epitopes are joined by GPGPG linkers (cyan), and the AAY linkers (green) connect cytotoxic epitopes to preserve MHC-I processing efficiency. The PADRE and L7/L12 ribosomal protein from M. tuberculosis serve as adjuvants. (C) In silico cloning of Cov19B into the pET26b(+) expression vector using restriction sites NdeI and XhoI, confirming suitability for prokaryotic expression systems. (D) Simulated cloning of Cov19T into the same vector system. Both constructs were codon-optimized and checked for restriction enzyme compatibility using SnapGene software to ensure efficient bacterial express.
XhoI and NdeI restriction sites were incorporated into both Cov19-B and Cov19-T constructs. The vaccine’s 1899 base pair nucleotide sequence was optimized. Following codon optimization, analysis revealed GC-Content of 47.35% for Cov19-B and 57.30% for Cov19-T. These results indicate effective codon adaptation for efficient expression in the selected bacterial strain.
As shown, the constructs were cloned intopET26b(+) for in-silico cloning (Fig 4C & 4D). The restriction enzymes XhoI and NdeI were positioned at the construct’s N- and C-terminals, respectively. Similarly, the same restriction sites were placed at corresponding positions in Cov19-T. Furthermore, a HisTag sequence was added to the C-terminus of each construct. The constructs are in the frame, as demonstrated by examining the final nucleotide sequence performed.
The results of physico-chemical properties indicate that the constructs comprise two proteins with a 649 (Cov19-B) and 465 (Cov19-T) aa with 74 kDa and 48 kDa molecular weight. The theoretical isoelectric point (pI) was determined to be 9.95 and 8.49 for Cov19-B and Cov19-T, respectively. The instability index was calculated as 38.11 for Cov19-B and 23.09 for Cov19-T while the aliphatic index was estimated to be approximately 58.15 and 94.06, respectively, for Cov19-B and Cov19-T. Furthermore, the protein obtained a GRAVY of −0.785 (Cov19-B) and −0.295 (Cov19-T). These calculated physicochemical parameters collectively indicate that both Cov19-B and Cov19-T represent suitable constructs for expression.
Homology modeling and its analyzation
The iterative refinement process enhances the precision and dependability of the predicted structure, leading to a clearer comprehension of the vaccine’s spatial arrangement and possible interactions with target molecules. As shown in Figs 5A and 5B, the amino acid sequence was estimated as the vaccine’s three-dimensional structure. Subsequently, the first 3D structure was refined, resulting in a final structure with a reduced root mean square deviation and improved structural features.
(A) Modeled 3D structure of Cov19B and (B) Modeled 3D structure of Cov19T.
Z-score of −3.9 (Cov19B) and −0.9 (Cov19T) were found using additional validation criteria for the improved 3D structure of (Fig 6A and 6B). Figs 6C and 6D show the Ramachandran graph, which shows that 85% and 72% of amino acids are situated in the preferred and allowed regions, respectively.
Validation of homology modeling for the multi-epitope vaccine constructs Cov19-B and Cov19-T, designed to target SARS-CoV-2.(A) Z-Score analysis of Cov19-B, calculated using ProSA-web, to assess the overall quality of the modeled structure. The Z-Score of −3.9, falls within the range of native protein structures, indicating high structural reliability and consistency with experimentally determined structures of similar size. (B) Z-Score analysis of Cov19-T, similarly computed, with a Z-Score of −0.9, confirming the model’s quality and structural integrity. (C) Ramachandran plot of Cov19-B, illustrating the distribution of phi (φ) and psi (ψ) angles of amino acid residues. The plot reveals 85% of residues in favored regions, indicating a high-quality model with minimal steric clashes. (D) Ramachandran plot of Cov19-T, showing 72% of residues in favored regions, further validating the structural accuracy of the modeled construct.
Although the Ramachandran analysis showed that less than 90% of residues fell within the favored regions for the vaccine constructs, the values remained close to this benchmark and are within acceptable limits for multi-epitope synthetic designs containing numerous flexible linker regions and non-native adjuvant sequences. The structural models were further validated through Z-score evaluation (via SAVES), low MolProbity clashscores, and successful structural refinement using ModRefiner. Moreover, molecular docking and 100 ns molecular dynamics simulations confirmed the constructs’ stability and correct folding, supporting their suitability for experimental expression and immunogenicity studies
Disulfide engineering of vaccine constructs
Disulfide engineering was performed to enhance the structural stability of Cov19-B and Cov19-T. The 3D structures were analyzed to identify residue pairs for disulfide bond formation. Four disulfide bonds were introduced into Cov19-B and two into Cov19-T by mutating spatially proximal residue pairs to cysteines. Pairs were selected based on a disulfide bond formation likelihood (%SS) ≥90%, bond energy ≤5 kcal/mol, and Σ β-Factor ≤15, ensuring stability. For Cov19-B, pairs A194 CYS–A199 PRO (χ3 = +99.50°, bond energy = 1.79 kcal/mol, Σ β-Factor = 8.34), A217 LYS–A221 VAL (χ3 = −76.06°, bond energy = 1.71 kcal/mol, Σ β-Factor = 8.47), A102 ASP–A105 GLN (χ3 = +74.45°, bond energy = 4.26 kcal/mol, Σ β-Factor = 10.16) and A166 ALA–A172 ASN (χ3 = −92.79°, bond energy = 0.54 kcal/mol, Σ β-Factor = 8.57) were selected (S1 Fig).
The pairs A81 ALA–A90 ARG (χ3 = +102.00°, bond energy = 0.20 kcal/mol, Σ β-Factor = 15.42), and A134 ALA–A146 TYR (χ3 = +93.26°, bond energy = 1.40 kcal/mol, Σ β-Factor = 21.20) were selected for Cov19-T (S2 Fig). Post-engineering, both constructs retained high antigenicity, were non-allergenic, and non-toxic. Molecular docking confirmed that disulfide bonds did not disrupt Cov19-B or Cov19-T binding affinities (ΔG < −10 kcal/mol).
Protein-protein docking
The structures of Cov19B and Cov19T were docked with HNA using the HDOCK server to identify interaction regions through protein-protein docking. The highest rankings complexes were chosen. The docking results revealed that Cov19B and Cov19T can interact with human-neutralizing antibodies (HNA). All Four complexes have negative binding energy (S4, S5, S6, and S7 Tables). So, in Cov19B-Heavy chain, Cov19B-Light, Cov19T-Heavy chain, and Cov19T-Light chain complexes had docking scores of −332.07, −323.14, −328.47, and −310.24, respectively.
As demonstrated in Figs 7A and 7C, both Cov19B exhibits remarkable capacity to interact with the heavy and light chains of the HNA antibody. Notably, interactions take place between the HNA antibody’s heavy chain’s HCDR1, HCDR2, and HCDR3 loops as well as its light chain’s LCDR1, LCDR2, and LCDR3 loops. These regions were specifically selected due to their potential to bind diverse sections of the RBD antigen in both Cov19B and Cov19T. These interactions are essential for coordinating the intricate molecular interactions required for the identification of antibodies by antigens, which in turn prepares the immune system to mount a successful defense against the pathogen of interest. As shown in Figs 7B and 7D, the interaction involved amino Tyr856, Tyr507, Lys359, Tyr360, Asn362, Arg583, Phe126, Thr120, Ser116 and Trp 347 from Cov19B with amino acids Ala113, Arg112, Thr147, Val177, Thr179, Leu46, Arg98, Gln1 and Gln3 from the heavy chain of the HNA antibody. Additionally, amino acids Asn 263, Tyr644, Asp623, Asp234, Gln223, Lys241, Val5, Arg8 and Asn636 from Cov19B interacted with amino acids Asn71, Ser183, Thr135, Thr72, Thr19, Arg193, Gln6, Ser191, Lys190, Lys170 from the light chain of the HNA antibody.
Docking visualization of Cov19B-Heavy chain, Cov19B-Light chain, Cov19T-Heavy chain and Cov19T-Light chain.(A) Cov19B structure in green color surface mode and Heavy chain in blue color cartoon mode. (B) Magnified interactions of Cov19B-Heavy chain. (C) Cov19B structure in green color surface mode and Light chain in copper color cartoon mode. (D) Magnified interactions of Cov19B-Light chain. (E) Cov19B structure in purple color surface mode and Heavy chain in blue color cartoon mode. (F) Magnified interactions of Cov19T-Heavy chain. (G) Cov19T structure in purple color surface mode and Light chain in copper color cartoon mode. (H) Magnified interactions of Cov19T-Light chain.
For the Cov19T-Light chain complex, structural analysis revealed 7 hydrogen bonds and 1 salt bridge mediating interactions between Ala448, Lys440, Lys427, Tyr71, Glu428, Val414, Thr495, and Ser243 amino acids of Cov19T and Lys220, Asp222, Lys223, Ser186, Ser134, Asp158, Gln119, and Ser142 amino acids of Light chain (Fig 7E and 7G). The interaction in the Cov19T-Light chain complex exhibited 4 hydrogen bonds responsible for the interaction between Tyr162, Tyr142, Asn121 and Met167 of Cov19T and Phe101, Ser179, Thr120 and Tyr89 of light chain (Fig 7F and 7H).
The 2D interactions are summarized and visualized so that graphical 2D HDOCK results are presented in Fig 8A (Cov19B-Heavy chain) and 8B (Cov19B-Light chain), as well as Fig 9A (Cov19T-Heavy chain) and 9B (Cov19T-Light chain) offering a detailed view of the protein-protein docking outcomes.
(A) 2D presentation of interactions in Cov19B-Heavy chain complex. (B) 2D presentation of interactions in Cov19B-Light chain complex.
(A) 2D presentation of interactions in Cov19T-Heavy chain complex. (B) 2D presentation of interactions in Cov19T-Light chain complex.
Analysis molecular dynamic and Poisson–Boltzmann Born surface area
The docked results underwent MD simulation for 100 nanoseconds. The hydrogen bond in the Cov19B-Heavy chain complex and Cov19B-Light chain complex was formed with a Root Mean Square Deviation (RMSD) of 0.9 and 1 nm, respectively (Fig 10A and 11A). RMSF for the Cov19B-Heavy chain complex and Cov19B-Light chain complex indicated a bit volatility (Fig 10B and 11B). The radius of gyration related to the complexes in both graphs indicated that Cov19B was more stable than the Heavy chain and Light chain (Fig 10C and 11C). As we can see in Fig 10D and Fig 11D number of hydrogen bonds in the Cov19B-Light chain complex was more than those in the Cov19B-Heavy chain complex. SASA per-time graph also demonstrated that Cov19B showed a stable structure compared to Cov19B-Heavy chain complex and Cov19B-Light chain complex (Fig 10E and 11E).
Molecular dynamics (MD) simulation results for the Cov19B-Heavy chain complex, evaluated over a 100 ns trajectory using GROMACS to assess structural stability and interaction dynamics.(A) Root Mean Square Deviation (RMSD) plot, showing the backbone deviation of Cov19B (green) and Heavy chain (blue) over time. Cov19B stabilizes at 0.4 nm after 40 ns, while the Heavy chain reaches equilibrium at 1 nm after 20 ns, indicating stable complex formation. (B) Root Mean Square Fluctuation (RMSF) plot, depicting residue-level flexibility. Cov19B exhibits low fluctuations 0.3 in epitope regions, while the Heavy chain shows higher fluctuations 0.22 nm in non-binding regions, suggesting conformational flexibility. (C) Radius of Gyration (Rg) plot, measuring the compactness of the complex. Both Cov19B and Heavy chain maintain stable Rg values 5 nm for Cov19B and 2.8 for Heavy chain, confirming structural integrity throughout the simulation. (D) Hydrogen bond (H-Bond) analysis, illustrating the number of H-bonds between Cov19B and Heavy chain over time. An average of 2–8 H-bonds is maintained, highlighting strong intermolecular interactions. (E) Solvent Accessible Surface Area (SASA) per-time plot, showing the solvent exposure of Cov19B (green) and Heavy chain (blue). SASA values stabilize at 500 nm for Cov19B and 100 nm² for Heavy chain, indicating minimal unfolding and consistent surface accessibility for immune recognition. These results collectively demonstrate the stability and interaction dynamics of the Cov19B-Heavy chain complex, supporting its potential for effective immune response elicitation.
Molecular dynamics (MD) simulation results for the Cov19B-Light chain complex, evaluated over a 100 ns trajectory using GROMACS to assess structural stability and interaction dynamics.(A) Root Mean Square Deviation (RMSD) plot, showing the backbone deviation of Cov19B (green) and Light chain (Copper) over time. Cov19B stabilizes at 0.4 nm after 20 ns, while the Light chain reaches equilibrium at 1 nm after 30 ns, indicating stable complex formation. (B) Root Mean Square Fluctuation (RMSF) plot, depicting residue-level flexibility. Cov19B exhibits low fluctuations 0.25 in epitope regions, while the Light chain shows higher fluctuations 0.15 nm in non-binding regions, suggesting conformational flexibility. (C) Radius of Gyration (Rg) plot, measuring the compactness of the complex. Both Cov19B and Light chain maintain stable Rg values 3.8 nm for Cov19B and 2.3 for Light chain, confirming structural integrity throughout the simulation. (D) Hydrogen bond (H-Bond) analysis, illustrating the number of H-bonds between Cov19B and Light chain over time. An average of 4–10 H-bonds is maintained, highlighting strong intermolecular interactions. (E) Solvent Accessible Surface Area (SASA) per-time plot, showing the solvent exposure of Cov19B (green) and Light chain (Copper). SASA values stabilize at 400 nm² for Cov19B and 140 nm² for Light chain, indicating minimal unfolding and consistent surface accessibility for immune recognition. These results collectively demonstrate the stability and interaction dynamics of the Cov19B- Light chain complex, supporting its potential for effective immune response elicitation.
Results from the MD simulation revealed that in the Cov19T-Heavy chain complex and Cov19T-Light chain complex, the hydrogen bond was established with a RMSD of 0.5 and 0.6 nm, respectively (Fig 12A and 13A). The RMSF diagrams for the residues in the Cov19T-Heavy chain complex and Cov19T-Light chain complex during the MD simulation showed minimal fluctuations (Fig 12B and 13B).
Molecular dynamics (MD) simulation results for the Cov19T-Heavy chain complex, evaluated over a 100 ns trajectory using GROMACS to assess structural stability and interaction dynamics.(A) Root Mean Square Deviation (RMSD) plot, showing the backbone deviation of Cov19T (purple) and Heavy chain (blue) over time. Cov19T stabilizes at 0.5 nm after 10 ns, similarity the Heavy chain reaches equilibrium at 0.5 nm after 10 ns, indicating stable complex formation. (B) Root Mean Square Fluctuation (RMSF) plot, depicting residue-level flexibility. Cov19T exhibits low fluctuations 0.32 in epitope regions, while the Heavy chain shows higher fluctuations 0.18 nm in non-binding regions, suggesting conformational flexibility. (C) Radius of Gyration (Rg) plot, measuring the compactness of the complex. Both Cov19T and Heavy chain maintain stable Rg values 3 nm for Cov19T and 2.2 for Heavy chain, confirming structural integrity throughout the simulation. (D) Hydrogen bond (H-Bond) analysis, illustrating the number of H-bonds between Cov19T and Heavy chain over time. An average of 2–6 H-bonds is maintained, highlighting strong intermolecular interactions. (E) Solvent Accessible Surface Area (SASA) per-time plot, showing the solvent exposure of Cov19T (purple) and Heavy chain (blue). SASA values stabilize at 320 nm for Cov19T and 130 nm² for Heavy chain, indicating minimal unfolding and consistent surface accessibility for immune recognition. These results collectively demonstrate the stability and interaction dynamics of the Cov19T-Heavy chain complex, supporting its potential for effective immune response elicitation.
Molecular dynamics (MD) simulation results for the Cov19T-Light chain complex, evaluated over a 100 ns trajectory using GROMACS to assess structural stability and interaction dynamics.(A) Root Mean Square Deviation (RMSD) plot, showing the backbone deviation of Cov19T (purple) and Light chain (copper) over time. Cov19T stabilizes at 0.4 nm after 50 ns, similarity the Light chain reaches equilibrium at 0.4 nm after 50 ns, indicating stable complex formation. (B) Root Mean Square Fluctuation (RMSF) plot, depicting residue-level flexibility. Cov19T exhibits low fluctuations 0.30 in epitope regions, while the Light chain shows higher fluctuations 0.25 nm in non-binding regions, suggesting conformational flexibility. (C) Radius of Gyration (Rg) plot, measuring the compactness of the complex. Both Cov19T and Light chain maintain stable Rg values 3.5 nm for Cov19T and 2.5 for Light chain, confirming structural integrity throughout the simulation. (D) Hydrogen bond (H-Bond) analysis, illustrating the number of H-bonds between Cov19T and Light chain over time. An average of 0–2 H-bonds is maintained, highlighting strong intermolecular interactions. (E) Solvent Accessible Surface Area (SASA) per-time plot, showing the solvent exposure of Cov19T (purple) and Light chain (copper). SASA values stabilize at 140 nm for Cov19T and 13 nm² for Light chain, indicating minimal unfolding and consistent surface accessibility for immune recognition. These results collectively demonstrate the stability and interaction dynamics of the Cov19T- Light chain complex, supporting its potential for effective immune response elicitation.
In the radius of gyration plots for both the Cov19T-Heavy chain and Cov19T-Light chain complexes, Cov19T exhibited greater stability than the Heavy chain and Light chain (Fig 12C and 13C). Additionally, Fig 12D and Fig 13D show hydrogen bonds number in the Cov19T- Heavy chain complex was higher than in the Cov19T-Light chain complex.
The per-time specific solvent-accessible surface area (SASA) graph indicated that Cov19T maintained a more stable structure compared to the Cov19T-Heavy chain complex and Cov19T-Light chain complex (Fig 12E and 13E). These results suggest that Cov19B remained relatively stable during binding with the Heavy chain. Notably, the Cov19T- Heavy chain complex and Cov19T-Light chain complex show its stability after 10 ns, in comparison the Cov19B-Heavy chain complex and Cov19T-Light chain complex show its stability after 5 ns.
MM/PBSA calculations confirmed the binding affinities of all complexes. Table 8 displays the computed energy of the bonds formed in each complex between the vaccines and Abs. The overall binding energy of the Cov19B-Heavy chain complex was −35.77, indicating a high affinity between the two proteins. The Cov19B-Light chain complex showed binding energy of −39.24.
Table 8: MMP/BSA analysis data.
The Cov19T-Heavy chain complex exhibited a binding energy of −5.91. The presence of a salt bridge between the Cov19T and Heavy chain structures may explain the less negative binding energy observed in the Cov19T-Heavy chain complex. Meanwhile, the Cov19T-Light chain complex showed a binding energy of −19.26. These values suggest that the structures of Cov19B and Cov19T bind effectively to mAb chains.
Subsequent studies reveal that other energies have a significant impact on the complexes under study, in addition to the total binding energy. To be more precise, the electrostatic energy is recorded at −117.63 and the van der Waals energy contribution is −53.35 in the Cov19B-Heavy chain complex. These values highlight how the complex’s molecular forces interact intricately, providing insight into the complex dynamics that govern the complex’s stability and interactions (Fig 14A). In the Cov19B-Light chain complex, the van der Waals energy is −54.98, and the electrostatic energy is −159.08 (Fig 14B).
MMPBSA analysis of binding free energy contributions for the complexes, derived from 100 ns molecular dynamics simulations using GROMACS.The plot illustrates the decomposition of binding energy into individual components, including van der Waals (vdW), electrostatic, polar solvation, and non-polar solvation energies for (A) Cov19B-Heavy chain, (B) Cov19B-Light chain, (C) Cov19T-Heavy chain, and (D) Cov19T-Light chain complexes. The total binding free energy for the complex’s interaction is calculated, with significant contributions from van der Waals interactions and electrostatic interactions, partially offset by polar solvation energy. The non-polar solvation energy contributes favorably, enhancing the stability of the complex. These energy contributions highlight the strong affinity between complexes, supporting the potential of structures to elicit a robust immune response through stable receptor binding, a critical factor in its efficacy as a multi-epitope vaccine candidate against SARS-CoV-2 variants.
These values provide insight into the specific forces driving the interactions within the Cov19B-Heavy chain and Cov19B-Light chain complexes.
The electrostatic energy in the Cov19T-Heavy chain complex is −8.79 and the van der Waals energy is −24.98, as shown in Fig 14C. On the other hand, as Fig 1D illustrates, the van der Waals energy in the Cov19T-Light chain complex is −27.98, and the electrostatic energy is 57.17. All of these quantitative evaluations indicate that both complexes are stable, with particular attention paid to the strong stability shown in Cov19B-Heavy chain, Cov19B-Light chain, and Cov19T-Heavy chain complexes. These findings highlight the complex intermolecular force balance that controls the formation of complexes and offer valuable new insight into their structural integrity and possible functional implications.
Immune simulation
The C-IMMSIM immunological study provided an in-silico perspective on the vaccine’s effectiveness and yielded valuable insights into the immunogenic profile the vaccine elicited. Interestingly, compared to the secondary and tertiary immune responses, the primary immune response was shown to occur at substantially lower levels after stimulation. Based on these results, it was calculated that to adequately boost the immune system’s response, a vaccination program consisting of three doses spaced four weeks apart would be required. The resulting immune responses showed IgM and IgG levels (Fig 15A and 16A).
C-ImmSim simulation results of the immunological response induced by the multi-epitope vaccine construct Cov19-B, evaluated following a three-dose regimen (injections at 0, 4, and 8 weeks) to assess its efficacy against SARS-CoV-2.(A) Levels of immunoglobulins (IgM, IgG1, IgG2) over time, showing a significant increase post-vaccination, with peak IgG levels reaching after the third dose, indicating a robust humoral immune response. (B) Total B-cell isotype populations, demonstrating a shift towards IgG1 and IgG2 isotypes, reflecting effective class-switching and antibody production. (C) Population dynamics of CD4 T-helper cells, which peak, following the second dose, highlighting the vaccine’s ability to stimulate cellular immunity. (D) Memory and total T-helper cell counts, with memory T-helper cells increasing to by week 12, ensuring long-term immune protection. (E) Population of CD8 T-cytotoxic lymphocytes, showing a sustained increase, after the third dose, indicative of strong cytotoxic activity against infected cells. (F) Production levels of cytokines IL-4, IL-6, and IL-12 over time, with IL-12 peaking, promoting Th1 differentiation, while IL-4 and IL-6 support Th2 responses and B-cell activation. These results collectively demonstrate that Cov19-B elicits a balanced humoral and cellular immune response, supporting its potential as an effective vaccine candidate against SARS-CoV-2 variants.
C-ImmSim simulation results of the immunological response induced by the multi-epitope vaccine construct Cov19-T, evaluated following a three-dose regimen (injections at 0, 4, and 8 weeks) to assess its efficacy against SARS-CoV-2.(A) Levels of immunoglobulins (IgM, IgG1, IgG2) over time, showing a significant increase post-vaccination, with peak IgG levels reaching after the third dose, indicating a robust humoral immune response. (B) Total B-cell isotype populations, demonstrating a shift towards IgG1 and IgG2 isotypes, reflecting effective class-switching and antibody production. (C) Population dynamics of CD4 T-helper cells, which peak, following the second dose, highlighting the vaccine’s ability to stimulate cellular immunity. (D) Memory and total T-helper cell counts, with memory T-helper cells increasing to by week 12, ensuring long-term immune protection. (E) Population of CD8 T-cytotoxic lymphocytes, showing a sustained increase, after the third dose, indicative of strong cytotoxic activity against infected cells. (F) Production levels of cytokines IL-4, IL-6, and IL-12 over time, with IL-12 peaking, promoting Th1 differentiation, while IL-4 and IL-6 support Th2 responses and B-cell activation. These results collectively demonstrate that Cov19-T elicits a balanced humoral and cellular immune response, supporting its potential as an effective vaccine candidate against SARS-CoV-2 variants.
One noteworthy finding in the tertiary response was the appearance of memory cells, which are indicative of a critical component of long-term immune defense. As seen in Figs 15B and 16B, this phase was marked by a robust activation of B cells, including B isotypes of IgM and IgG. At the same time, as Figs 15C and 16C show, there was a notable increase in the activity of CD4 T-helper cells and CD8 cytotoxic T lymphocytes, which are associated with secondary and tertiary responses. Nevertheless, it is significant that this increased activity eventually decreased. Furthermore, there was an association between the cytotoxic T cell (TC) population and the pre-activation of all TCs, suggesting that the immune system was responding in a coordinated and anticipatory manner (Fig 15D and 16D).
The final results of the investigation showed increased concentrations of key cytokines as well as significant effects on populations of CD8 T-cytotoxic lymphocytes after the injections. This finding highlights the vaccine’s ability to effectively activate the immune system, as Figs 15E, 16E, 15F, and 16F illustrate. All of these results point to the vaccine’s potential efficacy in stimulating a robust and diverse immune response. Ultimately, this analysis’s in-silico immune response predictions offer valuable insights into the vaccine’s anticipated effectiveness, contributing to well-informed decisions on its distribution and refinement.
Discussion
Numerous studies on SARS-CoV-2 provide a wealth of information, including the origin of the virus (bats), epidemiology, clinical features, and treatment strategies including antiviral drugs (remdesivir and nimratrelivir/ritonavir), monoclonal antibodies, and vaccines (BNT162b2 and mRNA-1273). It also addresses the evolution of the virus, mutations, and challenges associated with vaccine and therapeutic efficacy, emphasizing the importance of vaccination and public health measures [45,48,56–62].
SARS-CoV-2, as a global challenge, requires multifaceted approaches including widespread vaccination, development of effective treatments, and international cooperation. Despite significant advances in vaccine and antiviral drug development, ongoing virus mutations and declining vaccine efficacy over time highlight the need for continued surveillance, enhanced vaccine doses, and implementation of public health interventions. These findings underscore the need for adaptive strategies to control the pandemic and mitigate its impacts [63–74].
The rapid evolution of SARS-CoV-2 variants, such as Delta and Omicron, poses a significant challenge due to immune evasion and reduced efficacy of existing vaccines and therapeutics [42]. The immunopathogenesis of COVID-19, driven by dysregulated TLR-mediated cytokine storms and impaired adaptive immunity, underscores the need for vaccines that elicit robust, broad-spectrum protection [38,40]. This study addresses this problem by designing two multi-epitope vaccine constructs, Cov19-B and Cov19-T, targeting conserved epitopes across SARS-CoV-2 proteins to ensure coverage against variants. The necessity of this approach is evident given the limitations of current therapeutics, such as antivirals and monoclonal antibodies, which struggle against emerging variants [42].
Immunoinformatics enabled precise epitope selection using IEDB, NetCTL, Propred, and ABCpred, ensuring antigenicity, safety, and conservation [43,75]. The use of β-defensin and PADRE adjuvants, EAAAK and GPGPG linkers, and disulfide engineering enhanced immunogenicity and stability, as validated by molecular docking, dynamics simulations, and C-ImmSim immune simulations predicting strong IgG, IFN-γ, and IL-2 responses [44]. Immunoinformatics has proven instrumental in vaccine design against diverse pathogens, including viruses, bacteria, and helminths like Schistosoma mansoni and Fasciola hepatica. By identifying conserved epitopes and simulating immune responses, it offers a rapid, cost-effective approach to combat complex parasitic lifecycles and antigenic variability [76–80].
While this study is limited to in silico predictions, the methodologies align with validated approaches for SARS-CoV-2 and other pathogens [37]. Future experimental validation, such as in vitro binding assays or animal models, is needed to confirm efficacy, as demonstrated in similar studies [46]. Cov19-B and Cov19-T represent promising candidates for broad-spectrum SARS-CoV-2 vaccination, with potential adaptability to co-infections in endemic regions [43].
After the discovery of SARS-CoV-2 and the declaration of COVID-19 as a pandemic by the World Health Organization, scientists initiated efforts to develop vaccines aimed at mitigating the impact and transmission of the disease [81]. Vaccines that are live or attenuated have a long history of efficacy, but they can trigger autoimmune and allergic reactions. As a result, immunoinformatics technologies are now being used to eliminate biosafety concerns while also saving time and money. Multi-epitope vaccines have the potential to overcome the limitations of traditional vaccines. One of the most notable differences between multi-epitope vaccines and traditional vaccines is that personalized immune responses are elicited following proper identification of immunogenic epitopes, enabled by viral genome screening. Several SARS-CoV-2 vaccines have been developed and reported to date [33,82–85].
The gradual emergence of different variants reflects the evolutionary process of mutation, selection, and adaptation in the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Modifications in the amino acid composition of the spike protein lead to changes in the virus’s antigenic properties, transmissibility, and pathogenicity [86].
In general, the occurrence of mutations in the structure of SARS-CoV-2 enables it to escape
the immune system’s defenses or evade immune responses [87]. These mutations, in turn, can lead to the emergence of new variants that give SARS-CoV-2 the ability to bypass immune defense mechanisms and ultimately reduce the effectiveness of vaccines [88,89]. Therefore, the study, identification and application of these mutations can lead to the development of mutation-proof vaccines to combat COVID-19, which may be capable of acting against a wider range of COVID-19 variants [87].
The current study was carried out with the aim of addressing the lack of identified mutations in the spike structure and incorporating alternative amino acids in the epitopes bearing these mutations. It is expected that the mutations used in the structures will facilitate the design and prediction of a mutation-proof vaccine against SARS-COV-2.
The S-protein of these viral strains has undergone many mutations, which have rendered vaccines ineffective [89]. The frequent instances of recombination pose a challenge in predicting the efficacy of vaccines targeting the spike protein. Recombination has the potential to confer modified transmissibility, virulence, and immune escape properties on the virus [90,91]. Evidence has shown that in the S protein, the highest mutation frequency was observed in amino acids (aa) 508–635 (0.77%) and aa 381–508 (0.43%) [92].
One report has demonstrated the dominance of the SARS-CoV-2 variant with the D614G Spike mutation [93]. The D614 variant was prevalent in epidemics when G614 began to appear, and the G614 variant exhibits a higher titer as pseudotyped virions. The D614G mutation is part of a conserved haplotype with four mutations that consistently occur together. The increasing frequency of G614 in populations co-circulating with D614 suggests positive selection. Additionally, G614 is associated with higher viral nucleic acid levels in the upper respiratory tract and increased infectivity in pseudotyping assays [93]. Furthermore, the G614 variant in the Spike protein has spread more rapidly than the D614 variant globally, suggesting increased infectivity. This finding aligns with in vitro observations of higher infectivity with G614 Spike-pseudotyped viruses and the G614 variant’s association with lower patient Ct values, indicating potentially higher in vivo viral loads [93].
The SARS-CoV-2 variant carrying the G614 mutation has become the predominant circulating variant, replacing the D614 variant. The D614G substitution disrupts a hydrogen bond interaction, leading to the destabilization of the spike trimer and an increased interaction of the receptor binding domain (RBD) with the angiotensin-converting enzyme 2 (ACE2). This mutation, by elevating the viral load in the upper respiratory tract of COVID-19 patients, is suggested to enhance the transmission of SARS-CoV-2 [94,95].
Nevertheless, in a report of D614G, Epsilon, Alpha, Beta, and Gamma mutations found in pan-sarbecovirus vaccine candidate (DCFHP-alum) were associated with a decreased efficacy of DCFHP-alum [96].
Notably, it has been shown that certain substitutions in the spike protein, such as Q493R and Q498R, were consistent with the B.1.1.529 (Omicron) variant, while others like Y144del and H655Y were also identified in B.1.1.7 (Alpha), P.1 (Gamma), and B.1.1.529 (Omicron) variants [97–100]. On the other hand, it has been noted that the similarly located alteration P681R increases the Delta variant’s pathogenicity and ability to replicate. The virus’s antigenic epitope is gradually modified by these mutations as they accumulate one after the other. As a result of this process, “genetic drift” gives way to “antigenic drift “ [101–103].
There is even a report that shows that three key amino acid mutations in the S protein, including A605E, E633Q, and R891G, increase the infection efficiency of recombinant viruses in the Coronavirus porcine epidemic diarrhea virus (PEDV) to Infect Vero cells [104].
In the case of the Beta variant (B.1.351), alterations such as mutations at positions 242–244, K417N, E484K, and N501Y have demonstrated notable resistance against infection or vaccine-induced neutralizing antibodies [105–109]. The concurrent presence of E484K and N501Y mutations synergistically boosts the spike protein’s affinity for the human ACE2 receptors [110,111].
The Delta variant, also known as B.1.617.2, harbors multiple mutations that have been observed in other Variants of Concern. These mutations, including L452R, T478K, E484Q, D614G, and P681R, are all situated within the spike protein [103]. The L452R mutation has been linked to higher viral fusogenicity, decreased sensitivity to neutralizing antibodies, and increased infectivity. All of these factors work together to increase the virus’s reproductive rate [112].
The development of SARS-CoV-2 prior to the Omicron variants was mainly driven by the accumulation of recurrent mutations in important spike protein residues, including D614, N501, P681, K417, and E484. However, this evolutionary trajectory underwent a major change with the emergence of the Omicron variant and associated sublineages [86].
The sublineage BA.4.6, which descended from the BA.4 variant, is unique in that it carries two additional mutations located in the spike protein. These mutations, designated R346T and N658S, are changes that set BA.4.6 apart from its progenitor strain, BA.4 [113,114]. Also, the BF.7 variants, derived from BA.5, include an additional R346T mutation. This specific mutation has been linked to an increased capability of the virus to evade neutralizing antibodies produced by vaccines or prior infections [115–117]. Ma et al. (2023) reported an increase in Sarbecovirus fusogenicity by enhancing the usage of TMPRSS2 through the Spike substitution T813S [118].
Similar to our study, Bhattacharya et al. (2023) used the predicted mutations of Gly339Asp, Asn501Tyr, Ser477Asn, Thr478Lys, Tyr505His on SARS-CO2 spike to build the final structure of their proposed vaccine [89]. In addition, Zhang et al. (2022), used mutations such as Lys417Asn (Variants Of Concerns or VOCs in Delta), T478K (VOCs in Delta, Omicron), N501Y (VOCs in Alpha, Beta, Gamma, Omicron), E484K (VOCs in Beta, Gamma), K417T (VOCs in Gamma), S477N (VOCs in Omicron), K417N (VOCs in Beta, Omicron) and F490S (VOCs in Lambda) with the aim of fighting against different SARS-COV-2 variants in their construct [119].
Although the mutation data in this study were collected up to January 2022, they represent a substantial and diverse set of SARS-CoV-2 sequences including several high-frequency mutations that persist in later variants. Importantly, our methodology is adaptable and can be readily updated to include novel mutations as new sequence data emerge. Future iterations of the vaccine constructs can thus integrate real-time mutational trends, ensuring continued relevance against evolving strains
A variety of spike mutations and amino acid substitutions are listed in Table 9, where all mutations on epitopes containing amino acid substitutions with high antigenicity are mentioned and confirmed with other studies.
Table 9: Uniqueness of the selected epitopes carrying notable mutation on spike SARS-COV-2 to worldwide.
We designed two constructs capable of triggering both humoral and cellular immune responses a combination of computational tools and immunoinformatics methodologies in our research. Here, we harnessed a multi-epitope vaccination against the Spike protein, as well as mutations in the variants of Spike protein, in this study [85,153].
As an adjuvant, we employed beta-defensin 3 sequences. The C-terminal region of Cov19B has been found to behave as a possible adjuvant when combined with beta-defensin 3. We also inserted the PADRE sequence, which serves as a peptide with excellent adjuvant characteristics, into construct.
Additionally, we used the Large ribosomal subunit protein bL12, 50S ribosomal protein L7/L12 as a potent protein adjuvant for Cov19T structure. The GPGPG linkers allowing neighboring domains to operate more freely; it also has immune-modulatory properties [154]. Antigen processing was also been aided by the KK linker.
The selection of β-defensin 3 and PADRE as adjuvants for the Cov19-B and Cov19-T constructs, respectively, was driven by their complementary immunological roles. β-defensin 3, a natural antimicrobial peptide, enhances innate immunity by activating TLR4 on antigen-presenting cells, thereby promoting B-cell responses critical for neutralizing SARS-CoV-2 [45]. HBD-3 activates innate immunity by enhancing antigen uptake and APC maturation. Its known antimicrobial and immunomodulatory properties make it a valuable front-line enhancer of host defense.
Conversely, PADRE, a synthetic universal T-helper epitope, ensures robust CD4 + T-cell activation across diverse MHC class II alleles, enhancing cytotoxic T-cell responses against viral variants. PADRE provides a broad T-helper response across diverse HLA backgrounds, improving CD4 + T cell recruitment regardless of host genotype [57]. The EAAAK and GPGPG linkers were chosen to optimize the structural and functional integrity of the vaccine constructs. EAAAK provides rigidity and separates functional domains (adjuvants from epitope blocks) to prevent unfavorable interactions. EAAAK provides rigidity and separates functional domains (adjuvants from epitope blocks) to prevent unfavorable interactions. The rigid EAAAK linker prevents domain interference, while the flexible GPGPG linker facilitates epitope presentation to MHC molecules, as demonstrated in prior multi-epitope vaccine studies. GPGPG is a neutral, flexible linker ideal for MHC-II epitope separation, preserving their structure and immunogenicity [48]. KK enhances solubility and introduces structural flexibility, improving protein folding and expression. AAY facilitates proteasomal cleavage for effective MHC-I presentation. These choices align with the goal of designing a safe, immunogenic, and broadly protective vaccine against SARS-CoV-2 variants. Linkers were carefully chosen to preserve the immunological function of each epitope and reduce potential interference. This rational integration of adjuvants and linkers provides a structurally robust and immunologically potent construct that aims to elicit comprehensive immune protection.
The computed index identified specific epitopes that frequently bind to HLA molecules in target populations, providing coverage for more than 95% of the population’s cytotoxic T lymphocytes (CTLs) and helper T lymphocytes (HTLs) across 64 different geographic areas. A mix of linear and conformational B cell epitopes was added to the construct so that this specific epitope only elicits an immune response in those whose expressed MHC can attach to it. Peptides linked to cytotoxic T lymphocyte (CTL) responses require HLA I molecules, whereas peptides linked to helper T lymphocyte (HTL) responses require HLA II molecules. Thus, different HLAs can bind to the CLT and HTL epitopes.
MHS coverage is determined by calculating the percentage of HLA classes. Achieving a broader range of HLA molecule binding for each epitope contributes to significant population coverage among related individuals [155].
Seventy-three CTL epitopes and forty HTL epitopes were discovered from the spike, demonstrating the construct’s ability to produce high cellular immunity. We employed the AllerCatPro server to forecast probable allergenicity while paying attention to the significance of non-allergenicity for the immune system, which is often one of the critical inherent limitations in the vaccine design phase. The findings indicated that none of the epitopes tested were immune system allergens. Furthermore, toxicity testing of selected epitopes revealed that each epitope was expected to be non-toxic.
To determine the third protein’s structure, the I-TASSER server was employed, followed by refinement using GalaxyRefine. The structures showed improvement post-refinement, which is evident in comparing Z-scores before and after the refining process. Subsequently, the 3D-modeled construct’s concerning the amino acid sequence position revealed a high proportion of residual negative energy and a low proportion. These findings indicate potential defects or inaccuracies, which contributed minimally to the structural energy plot.
Validation related to homology modeling association suggests that most of the structure possesses favorable energy levels, contributing to overall stability. In contrast regions with higher energy are relatively limited in the construct, indicating the structure’s low energy and thus its strength in terms of energetic content. Proteins with stable structures can better maintain connections with other proteins and ligands. Furthermore, following refining, the Ramachandran graph revealed that 85% (Cov19B) and 72% (Cov19T) of amino acids have appropriate locations.
Pattern recognition receptors (PRR) recognize pathogen-associated molecular patterns (PAMPs) and activate the responsive innate host antiviral defense, which aids in the regulation of viral infection [156]. Since COVID-19 is a relatively new disease with limited clinical evidence, research has revealed that immune stimulation is essential in COVID-19 vaccinations.
The interaction of the recombinant protein of this vaccine with mAb was investigated using protein–protein molecular docking. The interactions generated in both CoV19B-Heavy chain, CoV19B-Light chain, CoV19T-Heavy chain, and CoV19T-Light chain complexes were explained by hydrogen bonds in CoV19B-Heavy/Light chain and CoV19-Heavy/Light chain complexes. The interaction of complexes showed a high affinity for them. Other binding energies indicate a strong affinity for the complexes [157].
Here, free energy calculations and molecular dynamics simulations were used to assess and simulate the vaccine model’s docking with monoclonal antibodies. Research has suggested a connection between heavy chain triggering and light chain signaling [158]. It can then activate immunological pathways implicated in pathogenesis and viral infection as a result [159–161]. Consequently, we conducted docking of the vaccine construct with the monoclonal antibody (mAb), and the MD complex data in the CoV-mAb system indicated appropriate interaction between the heavy chain/light chain and the CoV19B/T.
Studies show that neutralizing plasmas from donors who have recovered from COVID-19 exhibit a variety of antibody responses [162,163]. In one report, it has been examined polyclonal plasma IgGs, which exhibited different levels of cross-reactivity against the RBDs of the common cold virus and non-cross-reactive antibodies against S proteins from SARS-CoV-2, SARS-CoV, and MERS-CoV. Convalescent plasma IgG epitope mapping revealed an S1A epitope outside the RBD, which may serve as an alternate binding site for neutralizing antibodies, in addition to the expected targeting of the S protein RBD [162]. Also, examination of a neutralizing monoclonal antibody (mAb) from different patients revealed the presence of an anti-SARS-CoV-2 antibody that blocks the ACE2 receptor [19,162].
The idea that recurrent groups of anti-SARS-CoV-2 neutralizing antibodies from the VH gene segments use different RBD-binding mechanisms, as observed in several mAbs, respectively, is supported by structural findings in other research. This data is essential for analyzing the sequences of antibodies produced by vaccination or infection. Notably, frequent changes in different SARS-CoV-2 isolates seem unlikely to disrupt the identified RBD and S1A epitopes, providing hope for the efficacy of vaccines against the ongoing epidemic.
Research has shown that some antibodies potently neutralize the virus, correlating with their ability to compete with ACE2 for RBD binding. These antibodies and infected plasma did not cross-react with RBDs of SARS-CoV or MERS-CoV, although cross-reactivity was observed with their trimeric spike proteins. Crystal structure analysis has revealed that steric hindrance inhibiting viral engagement with ACE2, suggesting that anti-RBD antibodies are mainly viral-species-specific inhibitors [164].
In molecular dynamic simulations lasting 100 nanoseconds, the RMSD and RMSF diagrams exhibited minimal oscillations and fluctuations within the complexes, affirming the robust stability and flexibility of the vaccine structure. The RMSD, RMSF, and radius of gyration diagrams all indicated that the number of hydrogen bonds engaged in the complexes was maintained, which was supported by the amount of H-bonds.
Conclusion
The simultaneous presence of various SARS-CoV-2 variants and co-infections in immunocompromised individuals accelerates the generation of recombinant forms. Monitoring genomic variations in SARS-CoV-2, with a focus on mutations in the spike protein and instances of recombination, is crucial. This surveillance allows for the ongoing detection of shifts in both the viral genome and antigenic epitopes, fostering the development of pioneering vaccination methods and treatment modalities. Here, we introduced notable mutations in the spike protein of the coronavirus and applied them to pan-sarbecovirus constructs against the spike of the coronavirus. The results show that despite the amino acid substitutions, there is a proper connection between the vaccine structure and the heavy and light chains of the human antibody. Based on the results and analysis of this study, these structures probably have the ability to elicit an appropriate immune response in the body, and the results show that these structures deserve further investigation in in-vitro and in-vivo experiments.
Supporting information
S1 TablePopulation coverage calculation.(PDF)
S2 TablePredicted discontinuous epitope(s) of Cov19B.(PDF)
S3 TablePredicted discontinuous epitope of Cov19T.(PDF)
S4 TableSummary of the top 10 models for Cov19B-Heavy chain Table.(PDF)
S5 TableSummary of the top 10 models for Cov19B-Light chain Table.(PDF)
S6 TableSummary of the top 10 models for Cov19T-Heavy chain.(PDF)
S7 TableSummary of the top 10 models for Cov19T-Light chain.(PDF)
S1 FigResults of Engineering disulfide for Cov19-B 3D structure.(PDF)
S2 FigResults of Engineering disulfide for Cov19-T 3D structure.(PDF)
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Hui DS, I Azhar E, Madani TA, Ntoumi F, Kock R, Dar O, et al. The continuing 2019-n Co V epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan, China. Int J Infect Dis. 2020;91:264–6. doi: 10.1016/j.ijid.2020.01.009 31953166 PMC 7128332 · doi ↗ · pubmed ↗
- 2Choudhary S, Malik YS, Tomar S. Identification of SARS-Co V-2 Cell Entry Inhibitors by Drug Repurposing Using in silico Structure-Based Virtual Screening Approach. Front Immunol. 2020;11:1664. doi: 10.3389/fimmu.2020.01664 32754161 PMC 7365927 · doi ↗ · pubmed ↗
- 3Lu R, Zhao X, Li J, Niu P, Yang B, Wu H, et al. Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding. Lancet. 2020;395(10224):565–74. doi: 10.1016/S 0140-6736(20)30251-8 32007145 PMC 7159086 · doi ↗ · pubmed ↗
- 4Perlman S, Netland J. Coronaviruses post-SARS: update on replication and pathogenesis. Nat Rev Microbiol. 2009;7(6):439–50. doi: 10.1038/nrmicro 2147 19430490 PMC 2830095 · doi ↗ · pubmed ↗
- 5Zhu Z, Lian X, Su X, Wu W, Marraro GA, Zeng Y. From SARS and MERS to COVID-19: a brief summary and comparison of severe acute respiratory infections caused by three highly pathogenic human coronaviruses. Respir Res. 2020;21(1):224. doi: 10.1186/s 12931-020-01479-w 32854739 PMC 7450684 · doi ↗ · pubmed ↗
- 6Peeri NC, Shrestha N, Rahman MS, Zaki R, Tan Z, Bibi S, et al. The SARS, MERS and novel coronavirus (COVID-19) epidemics, the newest and biggest global health threats: what lessons have we learned?. Int J Epidemiol. 2020;49(3):717–26. doi: 10.1093/ije/dyaa 033 32086938 PMC 7197734 · doi ↗ · pubmed ↗
- 7Su S, Wong G, Shi W, Liu J, Lai ACK, Zhou J, et al. Epidemiology, Genetic Recombination, and Pathogenesis of Coronaviruses. Trends Microbiol. 2016;24(6):490–502. doi: 10.1016/j.tim.2016.03.003 27012512 PMC 7125511 · doi ↗ · pubmed ↗
- 8Anand K, Ziebuhr J, Wadhwani P, Mesters JR, Hilgenfeld R. Coronavirus main proteinase (3C Lpro) structure: basis for design of anti-SARS drugs. Science. 2003;300(5626):1763–7. doi: 10.1126/science.1085658 12746549 · doi ↗ · pubmed ↗
