The neutralizing antibody titer correlate of COVID-19 risk in the COVID-19 variant immunologic landscape (COVAIL) trial was not modified by SARS-CoV-2 amino acid sequence distances

Fei Heng; Craig A. Magaret; Nadine G. Rouphael; Angela R. Branche; Youyi Fong; Lindsay N. Carpp; Chenchen Yu; Shiyu Chen; Bo Zhang; David J. Diemert; Ann R. Falsey; Daniel S. Graciaa; Lindsey R. Baden; Sharon E. Frey; Jennifer A. Whitaker; Susan J. Little; Satoshi Kamidani; Emmanuel B. Walter; Richard M. Novak; Richard Rupp; Lisa A. Jackson; Tara M. Babu; Angelica C. Kottkamp; Anne F. Luetkemeyer; Lilly C. Immergluck; Rachel M. Presti; Martín Bäcker; Patricia L. Winokur; Siham M. Mahgoub; Paul A. Goepfert; Dahlene N. Fusco; Robert L. Atmar; Christine M. Posavad; Jinjian Mu; Mat Makowski; Mamodikoe K. Makhene; Seema U. Nayak; Viviana Simon; Harm van Bakel; Paul C. Roberts; Peter B. Gilbert

PMC · DOI: 10.1016/j.vaccine.2026.128348 · March 23, 2026

Open Access

TL;DR

A study found that the level of neutralizing antibodies after a booster shot was linked to lower risk of future COVID-19, regardless of how much the virus had mutated.

Contribution

The study shows that the antibody titer correlate of risk was not modified by SARS-CoV-2 amino acid sequence distances from the vaccine insert.

Findings

01. Neutralizing antibody titer after a booster was significantly associated with lower future COVID-19 incidence.

02. The association between antibody titer and risk did not vary with sequence distances from the vaccine insert.

03. Statistical power was limited due to low antigenic variability among circulating viruses during the trial.

Abstract

In the Coronavirus Variant Immunologic Landscape Trial (COVAIL) conducted in the United States in 2022–2023, 985 participants received a second COVID-19 booster with one of twelve monovalent or bivalent mRNA inserts. Pseudovirus serum inhibitory dilution 50% neutralizing antibody titer (nAb titer) measured two weeks post booster was significantly associated with lower COVID-19 incidence over six months of follow-up in this trial. COVAIL investigators sequenced SARS-CoV-2 Spike amino acid sequences for all COVID-19 cases, with a sequence successfully obtained from 129 of 195 cases. For COVID-19 endpoint cases we calculated five distances of the case-causing sequence to a reference sequence, the first two physico-chemical weighted Hamming distances of Spike or receptor binding domain (RBD) to a participant’s nearest Spike or RBD vaccine-insert sequence, and the other three estimated degrees of…

Keywords

Deep mutational scanning; Immune correlate of protection; mRNA vaccine; Neutralizing antibody escape; Randomized clinical trial; Recombinant protein vaccine

Taxonomy

Topics: SARS-CoV-2 and COVID-19 Research · Vaccines and immunoinformatics approaches · COVID-19 Clinical Research Studies

Full text

Introduction

COVID-19 vaccines based on the ancestral (GenBank accession no. MN908947) strain have been widely phased out and replaced by ones adapted [1–4] to counteract decreased vaccine effectiveness [5–10] against COVID-19 caused by emerging SARS-CoV-2 variants [11,12]. Booster vaccine doses [Prototype or variant-adapted] were also introduced, demonstrating efficacy especially over short-term follow-up [13–22]. The FDA recommended a BA.4/BA.5 strain booster in fall of 2022 and an updated XBB.1.5 strain booster in fall of 2023. In subsequent updates, strain selection has continued to evolve based on global surveillance, with the 2024–2025 season incorporating variants from the XBB lineage and more recently, JN.1 sublineages.

In this variant boost era, the randomized, open-label COVID-19 Variant Immunologic Landscape (COVAIL) vaccine trial was conducted to assess the safety and immunogenicity of variant COVID-19 boosters. These boosters elicited cross-reactive neutralizing antibodies against multiple variants — D614G, Beta, Delta, BA.1, and BA.4/BA.5 — but with antibody waning over three months and markedly reduced titers against newer Omicron subvariants (e.g., BQ.1.1 and XBB.1) [23,24].

Neutralizing antibody titer to the ancestral or D614G strain was established as a correlate of protection (CoP) [25,26] against COVID-19 caused by pre-Delta strains [27] and has been used as a surrogate endpoint to guide regulatory decisions [28–32]. Two analyses of COVAIL data investigated performance of pseudovirus inhibitory serum dilution 50% neutralizing antibody titer (nAb titer) as an immune correlate in the Omicron-variant era [33,34]. Fong et al., focusing on Stage 3 with recombinant protein vaccines, reported that peak nAb titer [Day 15 (D15) post-second boost] was an inverse correlate of risk (CoR) of BA.4/BA.5 COVID-19 over ~6-months post-D15 [33]. Although BA.4/BA.5 dominated during follow-up, the results did not support that matching the variant titer to the circulating strain was needed to preserve the correlate [33]. Zhang et al. reported similar results for recipients of a one-dose mRNA second vaccine boost in Stages 1, 2, and 4, where again matching the variant titer to BA.4/BA.5 did not yield a stronger CoR of BA.4/BA.5 COVID-19 as compared to the D614G titer [34], although it did improve statistical precision.

The Fong et al. and Zhang et al. analyses investigated the nAb titer as a correlate of overall or lineage-specific COVID-19. Additional insights into immune correlates can be potentially gained by accounting for pathogen sequence features beyond lineage. For example, based on pathogen sequences obtained from participants experiencing a given infectious disease endpoint, sieve analyses [35] assess how vaccine efficacy varies with pathogen sequence features, as was done for the Ad26.COV2.S [36] and mRNA-1273 [37] COVID-19 vaccines and for other vaccines and pathogens (e.g., refs. [38–41]). In a further step, pathogen sequence data may be incorporated into immune correlates analyses, for example by assessing nAb titer as a correlate of a disease endpoint specific to a particular genotypic or immunotypic distance of the pathogen to a vaccine-insert sequence. Sun et al. performed such an analysis of data from a randomized, placebo-controlled HIV-1 vaccine efficacy trial, finding that the inverse correlation of homologous anti-Env V2 IgG binding antibody concentration with acquisition of HIV-1 infection was stronger for viruses with shorter V2 amino acid sequence distances to the vaccine strains [42]. Qi et al. performed a similar analysis of a tetravalent dengue vaccine efficacy trial, finding that the inverse correlation of homologous average nAb titer (against the four vaccine strains) with symptomatic, virologically-confirmed dengue was stronger for dengue viruses with shorter amino acid sequence distances (defined at nAb contact sites) to the vaccine strains [43]. Such host-pathogen integrated analyses can demonstrate immunological relevance of sequence distance and may inform development of surrogate endpoints. The use of such integrative methods in COVID-19 has been limited to date, and their potential to inform immunobridging strategies [28,32] for approving refined or new vaccines remains underutilized. This study offers a critical opportunity to assess their applicability under real-world variant dynamics.

The question we address is whether the nAb titer correlate of risk becomes weaker against sequence-specific COVID-19 with greater distance from the vaccine-insert sequence or with greater estimated antibody escape score distance from XBB.1.5 based on deep mutational scanning experiments of XBB.1.5 using sera from XBB.1.5-infected individuals. If the answer is affirmative for a given sequence distance, then it suggests that the distance is immunologically relevant and adds information to the nAb titer immune correlate beyond the measurement of the titer alone. Our objective is to assess D15 nAb titer against the vaccine strain as a correlate of risk (CoR) of sequence-distance-specific COVID-19 over 6 months of follow-up post booster, the same period of follow-up studied previously [33,34].

Methods

2.1. Background on COVAIL

COVAIL was conducted in four sequential stages: Stage 1 began in March 2022 and Stage 4 in October 2022. The study enrolled adults in the United States stratified by age (18–64, 65+) and prior SARS-CoV-2 infection history, deemed to be in stable health, and previously vaccinated as defined by confirmed receipt of a complete primary and booster COVID-19 vaccine series, either homologous or heterologous, with an FDA authorized/approved vaccine at least 16 weeks prior to study vaccine dose 1 [23,24]. Participants were randomized to one of a stage-specific selection of monovalent or bivalent COVID-19 vaccine second boosters [mRNA or adjuvanted recombinant protein; Prototype, Beta (B.1.351), Delta (B.1.617.2), Omicron BA.1 (B.1.1.529.1), or Omicron BA.4/BA.5 (B.1.1.529.4/B.1.1.529.5) Spike inserts] [23,24]. See [23] for complete inclusion/exclusion criteria.

2.2. Two vaccine groups for analyses

We assessed our objective for each of two groups of one-dose mRNA vaccine arms: vaccines with an Omicron insert (monovalent or bivalent), and vaccines with a monovalent Prototype insert, referred to as the Omicron Vaccine Group and the Prototype Vaccine Group. The former group comprised arms 2, 4–6, 8–9, 12, 16, 17 whereas the latter group comprised arms 1 and 7 (Table 1 and refs. [23, 34]). The deep mutational scanning distances were studied for the Omicron Vaccine Group alone given that all infecting viruses were Omicron strains and the deep mutational scanning experiments measured antibody escape from the XBB.1.5 Omicron reference strain [44].

2.3. Definitions of naïve and non-naïve

As in Zhang et al. [34], SARS-CoV-2 non-naïve (“Non-naïve”) was defined as having a self-reported history of prior SARS-CoV-2 infection or having detectable anti-N antibodies [defined as Elecsys Anti-SARS-CoV-2 assay (Roche) cutoff index ≥1.0] at D1. Otherwise, a participant was considered SARS-CoV-2 naïve (“Naïve”).

2.4. Definition of COVID-19 endpoint and time frame for occurrence

As in [34], the COVID-19 endpoint was a self-reported positive SARS-CoV-2 test (RT-PCR or antigen test) or study-conducted positive SARS-CoV-2 test (nasal swab and subsequent nucleic acid amplification test at an unscheduled illness visit) with onset date the earliest positive test date. COVID-19 endpoints for correlates assessment occurred in the time period 7 to 188 days post-D15, with evaluable cases defined as COVID-19 endpoints in participants in the correlates analysis per-protocol cohort (defined in [34]) during this time period. Non-cases were participants in the correlates analysis per-protocol cohort with no evidence of SARS-CoV-2 infection during 7 to 188 days post-D15.

2.5. Viral genome sequencing

Viral RNA was isolated from nasopharyngeal swabs suspended in PBS or viral transport medium (VTM) using the Chemagic™ Viral DNA/RNA 300 Kit H96 (PerkinElmer, CMG-1033-S) on a Chemagic™ 360 automated extraction platform, following the manufacturer’s instructions. SARS-CoV-2 RNA was quantified by real-time RT-PCR targeting the N1 region using the CDC/NCIRD/DVD 2019-nCoV assay (IDT 2019-nCoV RUO Kit, 10006713). Specimens with cycle threshold (Ct) values ≤32 were selected for downstream sequencing analyses. Complementary DNA synthesis and whole-genome amplification were carried out using two custom primer pools designed to generate overlapping 1.5-kb and 2-kb amplicons spanning the viral genome, as previously reported [45]. Paired-end (2×150 bp) Nextera XT libraries (Illumina, cat. FC-131-1096) were prepared from amplicons and sequenced on a MiSeq instrument. Consensus genomes were generated using the Virus Reference-based Assembly Pipeline and IDentification (vRAPID) package [46]. Genomes with >95% coverage were genotyped with Nextclade CLI (v2.13.0 & v2.14.0) partially Aliased assignments [47] and pangolin (v4.1.3 & v4.3) [48].

2.6. Distances of spike and RBD amino acid sequences to reference sequences

We studied two physico-chemical weighted Hamming distances of a Spike amino acid (AA) sequence to the vaccine-insert sequence(s), computed for whole spike and for RBD. These distances were previously studied in COVID-19 vaccine trial sieve analyses [35,36]. The Prototype Vaccine Group boosters contain the Wuhan-Hu-1 reference sequence (GenBank accession number: MN908947), which is also referred to as Wuhan, Ancestral, or Prototype [49].
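As a rough illustration of the weighted Hamming distance described above, the following minimal sketch sums a per-mismatch weight over aligned positions and takes the minimum over a participant's vaccine inserts. The physico-chemical weights used in the study are not specified in this section, so `toy_weight` and its `HYDROPHOBIC` grouping are hypothetical placeholders, and all function names are illustrative only.

```python
from typing import Callable, Sequence

def weighted_hamming(seq: str, ref: str, weight: Callable[[str, str], float]) -> float:
    """Sum the mismatch weights over aligned positions of two AA sequences."""
    if len(seq) != len(ref):
        raise ValueError("sequences must be aligned to the same length")
    return sum(weight(a, b) for a, b in zip(seq, ref) if a != b)

def nearest_insert_distance(seq: str, inserts: Sequence[str],
                            weight: Callable[[str, str], float]) -> float:
    """Distance to the closest vaccine insert (two inserts for a bivalent vaccine)."""
    return min(weighted_hamming(seq, ref, weight) for ref in inserts)

# Hypothetical toy weights, NOT the study's physico-chemical weights:
# a substitution within the same crude class costs 0.5, otherwise 1.0.
HYDROPHOBIC = set("AVLIMFWC")

def toy_weight(a: str, b: str) -> float:
    same_class = (a in HYDROPHOBIC) == (b in HYDROPHOBIC)
    return 0.5 if same_class else 1.0
```

Non-integer weights of this kind would explain why the reported medians (for example 33.2 mismatches) are not whole numbers, although the actual weighting scheme should be taken from the cited sieve analyses.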

For the Omicron Vaccine Group, we also include three antibody escape score distances defined based on deep mutational scanning (DMS). The distances were computed based on antibodies in sera sampled from individuals infected with the Omicron XBB.1.5 SARS-CoV-2 strain in 2022–2023. The XBB.1.5 escape score values, calculated across full spike, were reported in Fig. 4a of Dadonaite et al. [44]. For each spike AA sequence from an Omicron Vaccine Group recipient who acquired a COVID-19 endpoint we calculate an antibody-escape score of the RBD sequence relative to the XBB.1.5 RBD sequence. The final distance used for the analysis is the weighted RBD Hamming distance to the XBB.1.5 RBD reference strain, modified such that if an AA site mutation away from XBB.1.5 matches the residue in Ancestral, then the mutation is removed by assigning the position a score of zero. In addition, if the mutation away from XBB.1.5 matches an individual participant’s vaccine-insert residue (or either vaccine insert for a bivalent vaccine), then it is removed by assigning a score of zero. This distance is motivated by observations from influenza antibody escape [50].
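The zeroing rules in the paragraph above can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes one escape score per RBD position (the `site_escape` mapping, keyed by 0-based index into the aligned sequences) and that all sequences are already aligned to the same length. All names are hypothetical.

```python
from typing import Mapping, Sequence

def dms_escape_distance(case: str, xbb15: str, ancestral: str,
                        inserts: Sequence[str],
                        site_escape: Mapping[int, float]) -> float:
    """Escape-weighted Hamming distance of a case RBD to the XBB.1.5 RBD.

    A position contributes its escape score only if the case residue
    differs from XBB.1.5 AND does not match Ancestral AND does not match
    any of the participant's vaccine-insert residues at that position.
    """
    lengths = {len(case), len(xbb15), len(ancestral), *(len(s) for s in inserts)}
    if len(lengths) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    total = 0.0
    for i, (res, ref) in enumerate(zip(case, xbb15)):
        if res == ref:
            continue  # no mutation away from XBB.1.5
        if res == ancestral[i]:
            continue  # reversion to the Ancestral residue scores zero
        if any(res == ins[i] for ins in inserts):
            continue  # matches a vaccine-insert residue, scores zero
        total += site_escape.get(i, 0.0)
    return total
```

For example, with toy inputs `xbb15="NLFY"`, `ancestral="KLFF"`, `inserts=["NLSY"]` and `case="KLSA"`, only the last position contributes, because the first mutation matches Ancestral and the third matches the insert.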

A potential limitation of the DMS-based distances is that many RBD AA positions have small escape scores close to zero, such that including all RBD positions may have more noise compared to distances that only include positions with escape score above a specified threshold. Accordingly, two additional distances are computed, calculated in the same way except only including the 14 RBD sites with greatest escape score from XBB.1.5 in Dadonaite et al. [44] listed in their Fig. 4a: 352, 357, 371, 375, 420, 421, 440, 447, 450, 455, 456, 473, 475, 485, which we refer to as the DMS-escape RBD-3 set of sites. Moreover, for ‘middle ground’ scores between inclusion of all RBD amino acid sites with escape score values (167 sites) and the 14 DMS-escape RBD-3 sites, we plotted the position-specific escape scores across all amino acid sites and sorted them, and searched for a cut-off that substantially reduced the total number of sites but not all the way down to the 14 DMS-escape RBD-3 sites, essentially looking for a kink in the convex hull of the plot. An escape-score cut-off of 0.7 was used to screen-in amino acid sites, which included 48 AA positions on which the DMS-escape RBD-2 distances were computed.
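The cut-off screening step can be pictured with a small sketch that counts how many sites survive each candidate threshold, which is one simple way to look for the kink described above. The scores in `toy_scores` are invented for illustration and are not the values from Dadonaite et al. [44]; whether the study's cut-off was inclusive (at or above 0.7) is also an assumption here.

```python
from typing import Dict, Iterable, List, Mapping

def sites_at_or_above(site_escape: Mapping[int, float], cutoff: float) -> List[int]:
    """Return the sorted sites whose escape score is at least the cut-off."""
    return sorted(site for site, score in site_escape.items() if score >= cutoff)

def site_counts_by_cutoff(site_escape: Mapping[int, float],
                          cutoffs: Iterable[float]) -> Dict[float, int]:
    """Number of retained sites for each candidate cut-off."""
    return {c: len(sites_at_or_above(site_escape, c)) for c in cutoffs}

# Hypothetical per-site escape scores (made-up numbers, for illustration only).
toy_scores = {352: 1.9, 357: 1.2, 371: 0.8, 403: 0.69, 417: 0.1, 455: 2.4}
```

In the study, the corresponding counts were 167 sites with any escape score, 48 sites at the 0.7 cut-off, and 14 sites in the DMS-escape RBD-3 set.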

In sum, for the Prototype Vaccine Group, two distances were studied, the spike and RBD weighted Hamming distances to the Ancestral reference sequence, and for the Omicron Vaccine Group, these two distances were also studied using each participant’s Omicron vaccine-insert sequence as the reference sequence. In addition, for the Omicron Vaccine Group the three antibody escape score distances DMS-escape RBD-1, RBD-2, and RBD-3 were studied.

2.7. Day 15 neutralizing antibody markers and viral sequence distances analyzed for each of the two groups

For Omicron Vaccine Group vaccine arms, the marker of interest is D15 log10 ID50 to Omicron BA.1, because BA.1 is the vaccine-insert lineage for vaccine arms 2, 4–6, 8–9, 12, 16. For vaccine arm 17, the marker of interest is D15 log10 ID50 to BA.4/BA.5, given the insert lineage is BA.4/BA.5. For Prototype Vaccine Group arms, the marker of interest is D15 log10 ID50 to D614G, because D614G, equal to Wuhan/Ancestral, is the insert strain. The analyses are done pooling Naïve and Non-naïve participants, and supplemental analyses are done only for Naïve participants.
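The arm-to-marker assignment above amounts to a small lookup. The sketch below encodes it under the arm numbering quoted in the text; the function name and the returned labels are illustrative, not identifiers from the trial database.

```python
# Arm numbers as listed in the text; names below are illustrative only.
OMICRON_BA1_ARMS = {2, 4, 5, 6, 8, 9, 12, 16}
OMICRON_BA45_ARMS = {17}
PROTOTYPE_ARMS = {1, 7}

def marker_strain_for_arm(arm: int) -> str:
    """Return the strain whose D15 log10 ID50 titer is the marker for this arm."""
    if arm in OMICRON_BA1_ARMS:
        return "BA.1"
    if arm in OMICRON_BA45_ARMS:
        return "BA.4/BA.5"
    if arm in PROTOTYPE_ARMS:
        return "D614G"
    raise ValueError(f"arm {arm} is not part of the two analysis groups")
```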

2.8. Criteria for whether an analysis is included

Acknowledging that the Sun et al. method [51] only provides worthwhile precision when there is sufficient variability in the mark/distance, each specified data analysis with this method is retained only if there are at least 10 unique mark/distance values represented among evaluable COVID-19 cases. Table 2 lists the analyses that are conducted, where the DMS-escape RBD-2 and RBD-3 analyses are conducted descriptively but not inferentially using Sun et al. because of the 6 and 2 unique values, respectively.
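The retention rule can be expressed as a one-line check. The sketch below is an assumption-laden illustration: the function name is hypothetical, and rounding before counting unique values (to avoid spurious floating-point differences) is a choice made here, not something stated in the text.

```python
from typing import Iterable

def qualifies_for_inference(distances: Iterable[float],
                            min_unique: int = 10, ndigits: int = 6) -> bool:
    """True if evaluable cases show at least `min_unique` distinct distance values."""
    unique_values = {round(d, ndigits) for d in distances}
    return len(unique_values) >= min_unique
```

A distance concentrated on a handful of values, as reported for DMS-escape RBD-2 and RBD-3, fails this check and is summarized descriptively only.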

Statistical methods

3.1. Estimation of sequence-distance-specific correlates of risk of COVID-19

The methods of Sun et al. [51] were used to estimate the AA-distance specific hazard ratio (HR) of COVID-19 per unit change in log10 D15 nAb titer (this HR corresponds to the hazard rate of COVID-19 with the given viral AA-sequence distance: the numerator hazard of second booster recipients with a given value of log10 D15 nAb titer, and the denominator hazard of second booster recipients with a log10 D15 nAb titer that is one unit smaller), implemented separately for the Prototype and Omicron Vaccine Groups. The methods estimate distance-specific HRs ranging over the spectrum of distance values using nonparametric kernel smoothing. The analyses right-censor participants at 188 days post D15 or at loss to follow-up if it occurred earlier. Results are presented as point and 95% confidence interval (CI) estimates of distance-specific HRs per unit change in D15 log10 nAb titer over all observed distance values. The methods of Sun et al. [51] were also used to estimate probabilities of distance-specific COVID-19 by 188 days post D15 (ranging over all distances) for subgroups defined by the 10th, 50th, and 90th percentile of the D15 nAb titer.
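One way to write the quantity being estimated is as a mark-specific proportional hazards model. The notation below is assumed for illustration and is not defined in this section: t is time since D15, v is the viral sequence distance, s is the log10 D15 nAb titer, and z collects the adjustment covariates. The exact specification in Sun et al. [51] and the Supplementary Methods may differ.

```latex
\lambda(t, v \mid s, z) = \lambda_0(t, v)\,\exp\{\beta(v)\, s + \gamma(v)^{\top} z\},
\qquad
\mathrm{HR}(v) = \frac{\lambda(t, v \mid s, z)}{\lambda(t, v \mid s - 1, z)} = \exp\{\beta(v)\}.
```

Under this form, HR(v) < 1 means that a 10-fold higher D15 titer is associated with a lower hazard of COVID-19 caused by viruses at distance v, and kernel smoothing over v lets β(v) vary with distance.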

3.2. Hypothesis testing

The methods of Sun et al. [51] were used to test two different hypotheses that distance-specific D15 nAb titer HRs varied with AA distance: The first is that the distance-specific HR departed from unity for at least one distance value (i.e., is nAb titer a CoR for some virus genotypes?), and the second is that the distance-specific HR increased with distance (i.e., does the strength of the CoR become weaker against viruses more distant from the reference virus?). Following Sun et al. the test statistics for evaluating these hypotheses are referred to as T1m and T2m, respectively.
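Writing HR(v) for the hazard ratio per unit change in log10 D15 nAb titer against viruses at distance v, the two tests can be summarized roughly as below. The null for the second test (HR constant in v) is an assumption made here for illustration; the precise formulation is given in Sun et al. [51].

```latex
T_{1m}:\quad H_{0}:\ \mathrm{HR}(v) = 1 \ \text{for all } v
\quad\text{vs.}\quad
H_{1}:\ \mathrm{HR}(v) \neq 1 \ \text{for some } v,
\\[4pt]
T_{2m}:\quad H_{0}:\ \mathrm{HR}(v) \ \text{does not depend on } v
\quad\text{vs.}\quad
H_{2}:\ \mathrm{HR}(v) \ \text{increases with } v.
```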

3.3. Handling missing viral sequence data

Of the 195 COVID-19 endpoints, 66 (33.8%) are missing viral sequence data, defined by any of the 223 residues in RBD having a missing value. Of the 66 endpoints with missing data, 39 generally did not have sequencing attempted because the cycle threshold value exceeded a threshold for attempting sequencing, and of the 27 partial sequences with missing content, 26 were missing more than 5% of the 223 residues and one was missing 7 residues (3.1%).

Hotdeck (predictive mean matching) multiple imputation is used to accommodate missing sequences, implemented as follows. For each COVID-19 case with a missing genotype, nearest neighborhoods of COVID-19 cases with viral sequence observed are defined based on z-scores of the calendar time of COVID-19 failure time (where the calendar time variable is calculated as the number of days from March 30, 2022). Ten imputed genotypes/marks from the 5-nearest neighborhoods are assigned. The hotdeck multiple imputation procedure is updated compared to that described in Sun et al. [51], to include an ABC bootstrapping of cases that provides proper multiple imputation.
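The nearest-neighbor donor step can be sketched as follows. This is a simplified illustration only: it matches on the z-scored calendar day alone, draws donors uniformly at random, and omits the bootstrap step mentioned above. Function and argument names are hypothetical.

```python
import random
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

def hotdeck_impute(missing_days: Sequence[float],
                   observed: Sequence[Tuple[float, float]],
                   k: int = 5, m: int = 10, seed: int = 0) -> List[List[float]]:
    """For each case with a missing sequence, draw m donor distances from
    the k sequenced cases closest in (z-scored) calendar time.

    observed: (calendar_day, distance) pairs for cases with a sequence.
    Returns one list of m imputed distances per missing case.
    """
    rng = random.Random(seed)
    all_days = [day for day, _ in observed] + list(missing_days)
    mu = mean(all_days)
    sd = pstdev(all_days) or 1.0  # guard against zero spread

    def z(day: float) -> float:
        return (day - mu) / sd

    imputations = []
    for day in missing_days:
        # k nearest sequenced cases in standardized calendar time
        donors = sorted(observed, key=lambda od: abs(z(od[0]) - z(day)))[:k]
        imputations.append([rng.choice(donors)[1] for _ in range(m)])
    return imputations
```

Each of the m completed data sets would then be analyzed separately and the results combined; that combination step is outside this sketch.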

All analyses adjust for baseline risk score and FOI standardized score. Analyses combining over Naïve and Non-naïve participants also adjust for naïve status. The Supplementary Methods provide additional details on the statistical methods, the implementation of the proposed estimation and hypothesis testing procedures, and the choice of bandwidths for kernel smoothing.

Results

Table 1 summarizes the COVAIL data included in the analyses and reports geometric mean D15 nAb titers for the antigens against which correlates were studied. For the Prototype Vaccine Group, the total sample size was 143 (100 non-cases, 43 cases), with 28 (65.1%) of the cases having SARS-CoV-2 sequence data. For the Omicron Vaccine Group, the total sample size was 744 (592 non-cases, 152 cases), with 101 (66.4%) of the cases having SARS-CoV-2 sequence data. The amount of data was greater for Moderna than Pfizer-BioNTech and for Naïve than Non-naïve. The geometric mean D15 nAb titers were lower in cases than non-cases, reflecting the published correlates results [34].

Table 3 summarizes baseline demographics, focusing on immunity-relevant factors, including stratification by Naïve and Non-naïve. For the Omicron Vaccine Group pooling over Naïve and Non-naïve, 24.9% of participants were age at least 65, 54.7% were female, and the top two vaccination histories were 3 doses of Pfizer-BioNTech mRNA (52.0%) and 3 doses of Moderna mRNA (39.0%), followed by 5–7% each receiving a mismatched mRNA vaccine manufacturer 2-dose primary series and first booster. A greater frequency of Naïve than Non-naïve participants were age at least 65 (34.2% vs. 9.8%) whereas sex and vaccination history had similar distributions between Naïve and Non-naïve. Also for the Omicron Vaccine Group, baseline geometric mean nAb titer against BA.4/BA.5 was 2329 overall, and 1418 and 5180 stratified by Naïve and Non-naïve, respectively, highlighting the immunity advantage of prior infection. Focusing on COVID-19 cases, covariate distributions did not appear different compared to the overall cohorts, except that baseline geometric mean nAb titers were lower in cases for both Naïve (geometric mean 1171) and Non-naïve (2674), which recapitulates the correlate of risk results previously reported [34].

For each of the five SARS-CoV-2 sequence distances defined in Methods, Fig. 1 shows distributions of the distances of COVID-19 endpoints for the Prototype and Omicron Vaccine Groups, stratified by Moderna/Pfizer-BioNTech booster and Naïve/Non-naïve. The Spike Hamming distances (Panels A, B) had median 33.2 and 24.8 AA mismatches to the closest vaccine strain for the Prototype and Omicron Vaccine Groups, respectively, showing how Omicron-matching moved the vaccine strains closer to the circulating strains. Restricting to the receptor binding domain (RBD), these medians were 17.3 and 8.0 AA mismatches, respectively. The deep mutational scanning (DMS) RBD-1 antibody escape score distances that considered whole RBD (see Methods) had median 1.73 for the Omicron Vaccine Group. A DMS-escape RBD-1 value of 0 represents the XBB.1.5 RBD sequence (no escape) and a value of x = 1.73 can be interpreted as an expected 1.73-fold reduced virus neutralization ID50 titer of sera from convalescent XBB.1.5-strain infected persons against that virus (geometric mean ID50 over individual sera) compared to against the XBB.1.5 reference virus [44]. For the Omicron Vaccine Group, the DMS-escape RBD-2 distances that considered a subset of RBD AA positions had median 1.57, whereas this median was 0 for the DMS-escape RBD-3 distances that further subsetted on the 14 RBD AA positions with greatest escape from XBB.1.5.

Variability of sequence distances across COVID-19 endpoint cases was a driver of statistical power for addressing the posed study question. The Spike Hamming distances were mostly concentrated between 30 and 35 (Prototype Vaccine Group) and between 22 and 27 (Omicron Vaccine Group), whereas the RBD Hamming distances had markedly more heterogeneity for the Omicron Vaccine Group (mostly concentrated between 4 and 12). The range of the DMS-escape RBD-1 distances across Omicron Vaccine Group COVID-19 endpoint cases corresponded to an expected range of nAb titer neutralization sensitivity reduction from no reduction to 3.02-fold reduction compared to the XBB.1.5 strain, and the DMS-escape RBD-2 distances ranged from no reduction to 2.87-fold reduction [these calculations excluded the XB outlier with Hamming distance ~6 for DMS-escape RBD-1 and RBD-2, and ~4 for DMS-escape RBD-3 (Fig. 1E–1G)]. All but two of the 101 Omicron Vaccine Group COVID-19 endpoint cases had DMS-escape RBD-3 distance equal to 0: 99 were infected with the wildtype (no-escape) XBB.1.5 sequence ARFFDYKGNLFYAG at the 14 high-escape positions, 1 with ARFFDYKGDLFYAG that had XBB.1.5 antibody escape 2.84, and 1 with ARFFDYKGNLLYAG (escape score 4.21) (Table 4). Therefore, almost no circulating strains that caused COVID-19 infections had any of the 14 high-ranked antibody escape mutations from XBB.1.5 while also mismatching the Ancestral strain and the participant’s vaccine sequence(s). This implies insufficient variability to assess how the nAb titer CoR depends on DMS-escape RBD-3 distance, such that these analyses were not conducted; these inferential analyses were also forgone for DMS-escape RBD-2 given its limited variability (see Methods).

Fig. 1 also shows the lineages causing COVID-19 endpoint cases: 7 BA.2, 3 BA.4, and 18 BA.5 cases in the Prototype Vaccine Group, and 27 BA.2, 10 BA.4, 59 BA.5, 1 XZ, 2 XBB.1.1, and 2 XBB.1.5 cases in the Omicron Vaccine Group. Thus 69.8% of all cases were with BA.4 or BA.5 lineages and 26.4% with BA.2. Previously Zhang et al. [34] described the timing of lineage-specific COVID-19 for the original correlates study, with BA.2 dominant in April–May 2022 and BA.4/BA.5 dominant thereafter through the end of follow-up May 25, 2023.
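The lineage percentages quoted above follow directly from the listed counts, as this short check shows. Only the counts come from the text; the variable and function names are illustrative.

```python
from collections import Counter
from typing import Iterable

# Sequenced COVID-19 endpoint cases by lineage, as listed in the text.
prototype_group = Counter({"BA.2": 7, "BA.4": 3, "BA.5": 18})
omicron_group = Counter({"BA.2": 27, "BA.4": 10, "BA.5": 59,
                         "XZ": 1, "XBB.1.1": 2, "XBB.1.5": 2})

combined = prototype_group + omicron_group
total_sequenced = sum(combined.values())

def percent_of_cases(lineages: Iterable[str]) -> float:
    """Percentage of all sequenced cases caused by the given lineages."""
    return round(100 * sum(combined[name] for name in lineages) / total_sequenced, 1)
```

Here `total_sequenced` is 129, matching the number of cases with a sequence, and the BA.4/BA.5 and BA.2 shares come out to 69.8% and 26.4%.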

Fig. 2 shows correlations of the viral sequence distances of COVID-19 endpoints for the Omicron Vaccine Group. The Spike and RBD Hamming distances were only moderately correlated [Spearman rank correlation (rho) = 0.26]. The Spike Hamming distance was essentially uncorrelated with the three antibody escape score distances (all rho < 0.30), whereas the RBD Hamming distance had some correlation with the DMS-escape RBD-1 distance (rho = 0.68). The DMS-escape RBD-1 and RBD-2 distances were highly correlated (rho = 0.85), which implies that results are expected to be similar for these two distances, such that dropping the RBD-2 CoR analyses is unlikely to have missed insights. For the Prototype Vaccine Group, the Spike and RBD weighted Hamming distances were highly correlated (rho = 0.86, Supplementary Fig. 1). The lower correlation of these Hamming distances for the Omicron Vaccine Group was due in part to the heterogeneity of Omicron vaccine-insert sequences (BA.1, BA.4/BA.5).
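For readers who want to reproduce this kind of summary, a Spearman rank correlation is simply a Pearson correlation computed on ranks. The sketch below uses only the standard library and average ranks for ties; it is a generic illustration, not the authors' analysis code, and it assumes both inputs have some variation.

```python
from math import sqrt
from typing import List, Sequence

def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)
```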

For the viral sequence distances with sufficient variability across COVID-19 endpoint cases to support assessment of distance-specific CoRs, Fig. 3 shows the viral distances versus the D15 nAb titers for the Prototype and Omicron Vaccine Groups, pooling Naïve and Non-naïve participants (Supplementary Fig. 2 restricts to the Naïve cohort). Our targeted hypothesis of the CoR weakening with viral distance would garner some support from a result with higher ID50 titer associated with larger viral sequence distance. However, the Spearman rank correlations were all near zero (all estimates with absolute value ≤0.12 and with six of the seven 95% confidence intervals including zero).

Figs. 4 and 5 show the analysis of D15 nAb titers as distance-specific CoRs for the Prototype and the Omicron Vaccine Groups, for the pooled Naïve + Non-naïve cohorts and all distances qualifying for analysis based on sufficient variability (Supplementary Figs. 3 and 4 restrict to the Naïve cohort). In Fig. 4, while the small 2-sided p-values from the test statistic T1m indicated evidence that titer inversely correlates with COVID-19, our central interest is in the 2-sided p-values for T2m that test whether the correlate of risk weakens against viruses with greater viral distance. The p-values for T2m ranged from 0.16 to 0.49, not rejecting the null hypothesis. However, the smallest p-value (0.16) was for the distance with greatest variability and hence greatest power in the analysis (RBD Hamming distance), and the estimate of the distance-specific hazard ratio showed a trend, with the strongest correlate against the shortest-distance viruses (HR = 0.60, 95% CI 0.39–0.92) and the weakest correlate against the longest-distance viruses (HR = 0.77, 95% CI 0.59–1.02). Fig. 5 shows the correlates results in a different way, showing the distance-specific Cumulative Incidence Function (CIF) rate for the value of D15 nAb titer fixed at the 10th, 50th, or 90th percentile value. The estimated curves being ordered from top to bottom moving from 10th to 50th to 90th percentile reveals the overall correlate of risk result also revealed by T1m as noted above. For our salient question, when comparing the CIF rate curves at the 10th and 90th percentile titers, we are looking to see whether the 10th vs. 90th percentile curves have greater departure at small distances than at large distances. This was potentially the case for RBD Hamming distance (Fig. 5C) but not for Spike Hamming distance (Fig. 5A, B) or DMS-escape RBD-1 (Fig. 5D).

Restricting to Naïve participants, the results were similar to those for pooled Naïve and Non-naïve participants (Supplementary Figs. 3 and 4). For example, the p-value for the CoR varying by RBD Hamming distance was 0.18 in Naïve participants (Supplementary Fig. 3C) compared to 0.16 in Naïve and Non-naïve participants pooled (Fig. 4C).

Discussion

Zhang et al. [34] showed that the nAb ID50 titer measured two weeks after a second booster dose with a Moderna or Pfizer-BioNTech mRNA vaccine correlated with a lower risk of COVID-19, but did not investigate how SARS-CoV-2 Spike amino acid sequences influenced the immune correlate. This sequel project considered five Spike or RBD amino acid sequence metrics quantifying dissimilarity of the SARS-CoV-2 sequences causing COVID-19 endpoints to the nearest vaccine-insert sequence or the XBB.1.5 reference sequence, to evaluate whether and how the previously documented nAb titer correlate of risk depended on the sequence metrics. With precedents from HIV-1 and dengue vaccines [43,51], our hypothesis was that the correlate of risk would weaken against SARS-CoV-2 strains with greater distances, and testing this hypothesis for several sequence metrics would provide an in vivo technique for discerning which metrics were immunologically significant. Our main result is lack of evidence that the correlate of risk weakened with viral distance, with p-values >0.10 for the 10 inferential analyses that were conducted, where this result is consistent with the previous result in COVAIL that BA.4/BA.5 titer was not a superior correlate of BA.4/BA.5 COVID-19 compared to D614G titer [34].

We propose that the negative result can be explained by limited genetic and antigenic dynamic range of the sequence distances. Indeed, for the Omicron Vaccine Group the first two deep mutational scanning escape RBD distances of case-causing viruses had median values 1.73 and 1.57, respectively, with ranges from 0 to 3.02 and 0 to 2.87. These values reflect the fold-difference in neutralization sensitivity of two viral isolates to sera from XBB.1.5-strain infected persons, where the previous immune correlates analysis of COVAIL one-dose mRNA Omicron-containing second-booster recipients estimated that a 10-fold change in nAb titer corresponded to a COVID-19 hazard ratio of 0.77 for Naïve and 0.52 for Non-naïve participants (Fig. 4 in [34]). In contrast, a mere 3-fold change in nAb titer corresponds to a COVID-19 hazard ratio of 0.88 for Naïve and 0.73 for Non-naïve participants. These modest effect sizes translate into relatively low power to detect modification of the nAb titer correlate of risk by viral sequence distance in the context of COVAIL. Future correlates analyses may need to pre-specify thresholds of sequence variability or focus on periods of strain turnover to optimize signal detection. Additional support for our proposed explanation stems from the finding that the smallest p-values (p = 0.16, 0.18) were for the sequence metric with greatest dynamic range – the RBD Hamming distance – with the point estimate hazard ratio per 10-fold change in D15 nAb titer changing from 0.60 to 0.77 against viruses with the shortest and longest distances to the Omicron-insert sequence, respectively. This suggests that a study with more participants or including more antigenically diverse viruses might demonstrate weakening of the correlate of risk with increasing distance. 
RBD Hamming distance has been previously shown to associate with vaccine efficacy, which diminishes against viruses more distant to the vaccine strain [36,52], although the trial-level analysis [52] was implemented before Omicron emerged when the range of RBD amino acid mismatches to the vaccine strain was only 0–3 compared to 0.4–18.6 mismatches in the present study. Moreover, 99 of the 101 (98.0%) COVID-19 endpoint cases in the Omicron Vaccine Group had an exact AA sequence match to XBB.1.5 in the 14 greatest-escape positions comprising the DMS-escape RBD-3 distance [44]. These frequencies highlight the limited antibody escape from XBB.1.5 among the circulating viral sequences that caused infections in COVAIL.
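
The 3-fold hazard ratios quoted above can be reproduced from the 10-fold values with a short calculation. The sketch below assumes the hazard is log-linear in log10 titer, so that the hazard ratio for an f-fold change equals the 10-fold hazard ratio raised to the power log10(f); the function name is illustrative and is not taken from the study's code.

```python
import math


def rescale_hazard_ratio(hr_per_10fold: float, fold_change: float) -> float:
    """Hazard ratio for a `fold_change` in titer, given the hazard ratio per 10-fold change.

    Assumes a log-linear model: log(HR) is proportional to log10(fold change).
    """
    if hr_per_10fold <= 0 or fold_change <= 0:
        raise ValueError("hazard ratio and fold change must be positive")
    return hr_per_10fold ** math.log10(fold_change)


# 10-fold hazard ratios as quoted in the text (Fig. 4 in [34]).
for cohort, hr10 in (("Naive", 0.77), ("Non-naive", 0.52)):
    hr3 = rescale_hazard_ratio(hr10, 3.0)
    print(f"{cohort}: HR per 3-fold change = {hr3:.2f}")
```

With these inputs the rounded results are 0.88 and 0.73, matching the values stated in the text.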

In addition to having the widest dynamic range, the RBD Hamming distance had greater sensitivity to detect a sequence distance-dependent correlate of risk than the antibody escape score distances, because the Hamming distances were calculated to the (nearest) vaccine-insert sequence as reference whereas the escape score distances used the XBB.1.5 sequence as reference, and the nAb titer utilized in the analysis was measured against the vaccine-insert sequence but not against XBB.1.5. The most statistically powerful version of the analysis pairs a vaccine-insert reference sequence for the study endpoint sequence distances with measurement of the immune response against the vaccine-insert antigen, given that vaccines elicit their highest responses against homologous sequences; this ideal analysis was conducted for the aforementioned HIV-1 and dengue analyses that detected signals. The ideal antibody escape score analysis would have used deep mutational scanning experiments run for each of the two Omicron vaccine-insert sequences, BA.1 and BA.4/BA.5. However, because these experimental data were not available owing to the resource intensity of deep mutational scanning experiments, the best available data were used instead, which were derived from the circulating Omicron variant XBB.1.5. The more closely XBB.1.5 matches the Omicron vaccine-insert strains, the less noise in the analysis and the more closely the implemented analysis approximates the ideal analysis. The previous antigenic cartography analysis showed that Stage 1 Moderna vaccinee sera at D15 had a geometric mean nAb titer against XBB.1.5 about 10-fold lower than against BA.1 or BA.4/BA.5 [23], indicating substantial noise in the implemented analysis.
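
To make the notion of a Hamming distance to the nearest vaccine-insert sequence concrete, the following is a minimal sketch, assuming pre-aligned amino acid strings of equal length. The function names and the toy sequences are hypothetical; the strings are invented for illustration and are not real RBD sequences. The sketch returns integer mismatch counts, whereas the range reported for this study (0.4–18.6) includes non-integer values, so the study's computation evidently involves steps not reproduced here.

```python
from typing import Dict, Tuple


def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count positions at which two aligned amino acid sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def distance_to_nearest_insert(endpoint_seq: str, inserts: Dict[str, str]) -> Tuple[str, int]:
    """Return (insert name, distance) for the insert closest to the endpoint sequence."""
    distances = ((name, hamming_distance(endpoint_seq, ref)) for name, ref in inserts.items())
    return min(distances, key=lambda pair: pair[1])


# Toy, made-up sequences (NOT real vaccine-insert or viral sequences).
toy_inserts = {"insert_A": "NITNLCPFGE", "insert_B": "NITNLCPFDE"}
toy_endpoint = "NITKLCPFDE"

nearest_name, nearest_distance = distance_to_nearest_insert(toy_endpoint, toy_inserts)
print(nearest_name, nearest_distance)
```

Here the toy endpoint differs from `insert_A` at two positions and from `insert_B` at one, so the nearest-insert distance is 1.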

We consider limitations of this study. First, as noted above, the limited genetic and antigenic variability of circulating strains during the 13-month calendar follow-up period restricted statistical power for detecting distance-specific correlates of risk. This limitation underscores the significance of studying the dynamic range of circulating strain characteristics over a given calendar period in a given population for informing various decisions such as the timing of updating booster strains. Second, as stressed above, the deep mutational scanning experimental data were available for the XBB.1.5 reference strain but not for each Omicron vaccine-insert strain (BA.1, BA.4/BA.5) – availability of these data would have enhanced the precision and power for detecting whether the nAb titer correlate of risk weakened with escape score distance. Third, there were insufficient data to study distance-specific correlates of risk separately by vaccine manufacturer and separately for Non-naïve participants, with the latter limitation significant given that the nAb titer correlate of risk was stronger in Non-naïve participants [34]. Fourth, ideally a study would evaluate immune correlates for severe/hospitalized COVID-19 for maximal clinical significance, which was not possible given the rarity of this outcome in a fully immunized study population. Fifth, 34% of the COVID-19 endpoint cases were missing viral sequences, necessitating statistical methods to account for this gap. Sixth, this study only evaluated nAb titer, and other immune responses could be playing a larger role in protecting against variant COVID-19, perhaps limiting the ability to detect associations with titer. The literature is replete with results supporting that low neutralizing antibody titer marks vulnerability to COVID-19, and some literature also supports that certain high-quality and high-quantity T cell responses may be able to compensate for this vulnerability (e.g., the “Swiss cheese model” [53]).
Accordingly, had sufficient T cell data been available from COVAIL participants, it would be interesting to investigate whether the nAb titer distance-specific correlate of risk had stronger attenuation with distance for individuals with poor T cell response. Lastly, given that neutralizing epitopes of SARS-CoV-2 are generally conformational rather than linear, and the distances considered were not designed to target conformational epitopes, future research could pursue more structurally/conformationally relevant viral distances.

Strengths of this research project include a large number of COVID-19 endpoints and use of a validated pseudovirus neutralization assay conducted against multiple strains including those in the vaccines tested as well as those in circulation during the study. While 34% of the COVID-19 endpoint cases were missing viral sequences as stated above, an additional strength of the present study is that we designed statistical methods to account for this fact, with predictive mean matching/hotdeck imputation performed to limit bias and improve efficiency.
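
For readers unfamiliar with predictive mean matching, the sketch below illustrates the general idea for a single missing-value pattern: fit a regression on cases with observed distances, then fill each missing distance with the observed value of a randomly chosen donor whose predicted value is close. This is a simplified single-imputation illustration under stated assumptions, not the procedure used in this study; the covariate, the function name, and the data are hypothetical, and a full implementation would typically also draw the regression parameters and repeat the imputation several times.

```python
import random
from typing import List, Optional, Sequence


def pmm_impute(
    covariate: Sequence[float],
    distance: Sequence[Optional[float]],
    n_donors: int = 3,
    seed: int = 0,
) -> List[float]:
    """Fill missing distances (None) by predictive mean matching on one covariate."""
    rng = random.Random(seed)
    observed = [(x, d) for x, d in zip(covariate, distance) if d is not None]
    if len(observed) < 2:
        raise ValueError("need at least two observed cases to fit the regression")

    # Simple least-squares line fitted to the observed cases only.
    mean_x = sum(x for x, _ in observed) / len(observed)
    mean_d = sum(d for _, d in observed) / len(observed)
    sxx = sum((x - mean_x) ** 2 for x, _ in observed)
    sxy = sum((x - mean_x) * (d - mean_d) for x, d in observed)
    slope = sxy / sxx if sxx > 0 else 0.0
    intercept = mean_d - slope * mean_x

    def predict(x: float) -> float:
        return intercept + slope * x

    # Each donor carries its predicted value and its actually observed distance.
    donors = [(predict(x), d) for x, d in observed]

    completed: List[float] = []
    for x, d in zip(covariate, distance):
        if d is not None:
            completed.append(d)
            continue
        target = predict(x)
        nearest = sorted(donors, key=lambda pair: abs(pair[0] - target))[:n_donors]
        # Impute with the observed value of a randomly drawn nearby donor.
        completed.append(rng.choice(nearest)[1])
    return completed


# Hypothetical data: a covariate (e.g. a time index) and distances with two gaps.
toy_covariate = [1, 2, 3, 4, 5, 6]
toy_distance = [2.0, None, 4.0, 5.0, None, 7.0]
print(pmm_impute(toy_covariate, toy_distance, n_donors=2, seed=1))
```

Because imputed values are always drawn from observed donors, the filled-in distances stay within the range of the observed data, which is one reason this family of methods is often preferred over plugging in regression predictions directly.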

In conclusion, these findings underscore the value of integrating viral sequence data into immune correlate analyses, while also highlighting the challenge of doing so when there is limited antigenic diversity among circulating strains during the trial period. Specifically, the genetic and antigenic heterogeneity in the United States from April 2022 to May 2023 during the correlates analysis follow-up period of COVAIL appeared to be too limited to alter the nAb titer correlate of risk, reflecting only an approximately 3-fold range of differential nAb titer sensitivity to sera. Additionally, the trend of the neutralizing antibody titer correlate of risk being stronger against strains better matched in their RBD amino acid sequence to their vaccine-insert sequences is likely a real result, predicting that if there had been a substantially greater dynamic range of circulating sequences then this study would have detected it. These results inform deliberations about the extent of genetic and antigenic SARS-CoV-2 viral evolution that must occur before updating the booster strain is predicted to substantially influence the level of booster protection.

Supplementary Material

MMC1

Bibliography

1. Pather S, Muik A, Rizzi R, Mensa F. Clinical development of variant-adapted BNT162b2 COVID-19 vaccines: the early omicron era. Expert Rev. Vaccines 2023;22:650–61. PMID: 37417000. doi:10.1080/14760584.2023.2232851
2. US Food and Drug Administration. Coronavirus (COVID-19) Update: FDA Recommends Inclusion of Omicron BA.4/5 Component for COVID-19 Vaccine Booster Doses. Press Release/Public Statement; 2022. https://www.fda.gov/emergency-preparedness-and-response/coronavirus-disease-2019-covid-19/covid-19-vaccines.
3. US Food and Drug Administration. Updated COVID-19 Vaccines for Use in the United States Beginning in Fall 2023. Press Release/Public Statement; 2023. https://www.fda.gov/emergency-preparedness-and-response/coronavirus-disease-2019-covid-19/covid-19-vaccines.
4. Pather S, Madhi SA, Cowling BJ, Moss P, Kamil JP, Ciesek S, et al. SARS-CoV-2 omicron variants: burden of disease, impact on vaccine effectiveness and need for variant-adapted vaccines. Front Immunol. 2023;14:1130539. PMID: 37287979. doi:10.3389/fimmu.2023.1130539. PMCID: PMC10242031
5. Link-Gelles R, Ciesla AA, Fleming-Dutra KE, Smith ZR, Britton A, Wiegand RE, et al. Effectiveness of bivalent mRNA vaccines in preventing symptomatic SARS-CoV-2 infection - increasing community access to testing program, United States, September-November 2022. MMWR Morb. Mortal Wkly. Rep. 2022;71:1526–30. PMID: 36454688. doi:10.15585/mmwr.mm7148e1. PMCID: PMC9721148
6. Zeng B, Gao L, Zhou Q, Yu K, Sun F. Effectiveness of COVID-19 vaccines against SARS-CoV-2 variants of concern: a systematic review and meta-analysis. BMC Med. 2022;20:200. PMID: 35606843. doi:10.1186/s12916-022-02397-y. PMCID: PMC9126103
7. Buchan SA, Chung H, Brown KA, Austin PC, Fell DB, Gubbay JB, et al. Estimated effectiveness of COVID-19 vaccines against omicron or Delta symptomatic infection and severe outcomes. JAMA Netw. Open 2022;5:e2232760. PMID: 36136332. doi:10.1001/jamanetworkopen.2022.32760. PMCID: PMC9500552
8. Andrews N, Stowe J, Kirsebom F, Toffa S, Rickeard T, Gallagher E, et al. Covid-19 vaccine effectiveness against the omicron (B.1.1.529) variant. N. Engl. J. Med. 2022;386:1532–46. PMID: 35249272. doi:10.1056/NEJMoa2119451. PMCID: PMC8908811