Design of solubly expressed miniaturized SMART MHCs
William L. White, Hua Bai, Chan Jhong Kim, Kevin M. Jude, Renhua Sun, Laura Guerrero, Xiao Han, Xiaojing Tina Chen, Apala Chaudhuri, Julia E. Bonzanini, Yi Sun, Amarachi E. Onwuka, Nan Wang, Chunyu Wang, Per-Åke Nygren, Xinting Li, Inna Goreshnik, Aza Allen, Paul M. Levine

TL;DR
Scientists designed a new type of MHC protein that can be produced in bacteria and retains the ability to bind T-cells and peptides, helping study immune responses.
Contribution
The novel SMART MHCs replace β2m and the α3 domain with a small designed stabilizing domain, enabling soluble expression while maintaining functional binding properties.
Findings
SMART MHCs retain peptide- and TCR-binding specificity.
Peptide-bound SMART MHC structures resemble native MHCs.
SMART MHCs can be produced in E. coli in soluble form.
Abstract
The precise recognition of specific peptide-MHC (pMHC) complexes by T-cell receptors (TCRs) plays a key role in infectious disease, cancer and autoimmunity. A critical step in many immunobiological studies is the identification of T-cells expressing TCRs specific to a given pMHC antigen. However, the intrinsic instability of empty class-I MHCs limits their soluble expression in Escherichia coli and makes it very difficult to characterize even a small fraction of possible pMHC/TCR interactions. To overcome this limitation, we designed small proteins which buttress the peptide binding groove of class I MHCs, replacing β2-microglobulin (β2m) and the heavy chain α3 domain, and enable soluble and partially soluble expression in E. coli of H-2Db and A*02:01, respectively. We demonstrate that these soluble, monomeric, antigen-receptive, truncated (SMART) MHCs retain both peptide- and…
Taxonomy
Topics: T-cell and B-cell Immunology · Vaccines and immunoinformatics approaches · Immunotherapy and Immune Responses
Introduction
Recombinantly expressed peptide-major histocompatibility complexes (pMHCs) are widely used as staining reagents to identify or isolate T-cell subsets that recognize a peptide of interest^1^. They are often used to study T-cell specificity^2–4^, infectious disease^5^, autoimmunity^6,7^, and cancer immunology^8–10^. Recombinant pMHCs are also critical in determining the structures of peptide/MHC and pMHC/T-cell receptor (TCR) complexes^2,11,12^. All these discoveries were made despite the significant difficulties involved in producing the soluble pMHCs necessary for the underlying biophysical, structural, and functional experiments. The pMHC production process, which involves separate expression in E. coli of the two MHC chains as insoluble inclusion bodies, solubilization, and refolding in the presence of the desired peptide^13^, is expensive, slow, and inefficient. To reduce the burden of refolding, systems have been developed wherein a single refolding reaction is split and loaded with many different peptides^14–17^; these methods have increased the number of peptide variants that can be studied, but are still limited by the need for refolding. Eukaryotic expression systems that fold the MHC structure natively^18–22^ have been used in peptide library screens where peptide variants are fused to the MHC and displayed on the cell surface, but can be limited in library size or expression levels.
The difficulties in producing pMHCs likely stem from the inherent instability of the MHC molecule when either a peptide or the β2m subunit is absent. A more stable MHC-like molecule that could be readily expressed in E. coli or yeast would enable the study of peptide-specific T-cell populations at a much larger scale. Instead of relying on specialized facilities to produce refolded pMHCs for staining experiments^13^, immunologists could produce them in-house, dramatically improving their ability to iterate through multiple peptide variants or MHC alleles. A stabilized native-like MHC could also facilitate screening of large peptide libraries by yeast display without the need for prior optimization of the MHC sequence for display.
We reasoned that such a molecule could be created by leveraging the stability of de-novo designed proteins and recent advances in protein-protein interface design^23^ to replace portions of the MHC with a designed protein scaffold. We set out to design soluble, monomeric, antigen-receptive, truncated (SMART) MHC molecules that replace the α3 domain and the β2m subunit with a small designed protein domain that buttresses the peptide binding groove of the pMHC. These stabilizing domains should preserve the native peptide- and TCR-binding properties, and allow soluble expression in the absence of a peptide without the need for refolding, providing a peptide-receptive MHC that can be loaded with arbitrary peptides.
Design of SMART MHCs
Native class I MHCs are composed of a heavy chain, the β2m subunit, and a peptide^24^. The heavy chain α1 and α2 domains consist of a β-sheet supporting two α-helices which create a peptide binding groove that defines the peptide binding specificity of each MHC-I allomorph and facilitates interactions with TCRs^24^. The heavy chain α3 domain, and the β2m subunit, are more membrane proximal and function as structural support for the α1 and α2 domains^24^. The α3 domain additionally provides a binding site for the CD8 co-receptor on T-cells^25^. Since class I pMHC structures are well-conserved, we selected the mouse H-2D^b^ allele as a representative in our stabilizing design process (fig. 1A).
We first removed as much of the structure as possible without disrupting the ability of the MHC to present peptides and interact with TCRs. Following the precedent of previously published truncated “mini-MHC” systems, we removed the α3 and β2m domains completely^11,26,27^ (fig. 1B, left). The truncated MHC heavy chain (residues 1–179) contains a hydrophobic patch on the underside of its β-sheet which we targeted for our stabilizing domain design. We used methods developed for protein binder design^28^ to create this de-novo stabilizing domain. We collected a set of candidate backbone structures for the stabilizing domain (fig. 1B, left), docked these backbones against the truncated H-2D^b^ structure and designed their sequences to create favorable contacts with the MHC, replacing the interactions made by the deleted α3 domain and β2m subunit (fig. 1B, middle). Finally, we linked each designed stabilizing domain to the N-terminus of the truncated MHC by a poly-GGS linker (fig. 1B, right).
We screened a total of about 10^4^ designs generated by this method using yeast surface display^29^. We sorted for designs that enabled the expression of truncated H-2D^b^ on the yeast surface (fig. S1A) and that were able to bind a FITC-labeled gp33 peptide (FITC-gp33) (fig. S1B), which is bound strongly by native H-2D^b^ ^30,31^. Expression sorting provided only slight separation of our designs from sequence-scrambled negative controls, while peptide sorting created clear separation between successful designs and controls (fig. 1C–D). Interestingly, designs containing a Trp residue placed similarly to W60 of β2m in the native H-2D^b^/β2m/gp33 structure were positively enriched, aligning with previous observations that mutations at W60 significantly destabilize the interaction between β2m and the MHC heavy chain^32,33^.
We identified the 30 designs with the highest peptide-binding enrichment and expressed them in E. coli. The best performing design (hit6; fig. 1E, top) enabled partial folding of empty MHCs in both yeast and E. coli expression systems. However, in E. coli only a fraction of the hit6 protein was soluble, and that fraction was susceptible to proteolysis, resulting in very low yields (fig. 1F; table S7). To improve hit6, we identified likely cleavage sites based on the masses of the proteolytic fragments and redesigned the amino acid sequence near those sites, keeping the amino acids that directly interact with the peptide or TCR fixed and biasing towards amino acids that occur frequently in other MHC alleles (fig. S2). We chose the cleavage site mutant (CSM) with the lowest degree of proteolysis: CSM8 (fig. 1E, bottom). To distinguish between these variants, data in all figures are colored teal or yellow to indicate hit6 or CSM8, respectively.
We tested the expression of the SMART constructs in a cell-free expression (CFE) system compatible with a variety of high-throughput screening methods^34–42^. Engineered CFE systems can produce high protein yields^40^ and create oxidizing environments for disulfide bond formation, which is needed for MHC folding^41–43^. The CFE results confirmed the E. coli results, demonstrating that, in contrast to the native H-2D^b^, the stabilized designs can be expressed solubly, and that CSM8 H-2D^b^ is expressed more efficiently than hit6 H-2D^b^ (fig. 1G). These results indicate that SMART MHCs can be expressed in a variety of systems that are not compatible with soluble expression of native MHC-I molecules.
SMART H-2Db retains native binding properties and structure
To verify that the soluble, peptide-free CSM8 H-2D^b^ material produced in E. coli retained the functional characteristics of the native MHC, we first measured the binding affinity of CSM8 H-2D^b^ for FITC-gp33 using fluorescence polarization (FP). These measurements demonstrated that CSM8 H-2D^b^ binds FITC-gp33 with an apparent dissociation constant (K_D,app_) below 1 nM (fig. 2A); this value is lower than the previously reported value of 21 nM for native H-2D^b^ ^47,48^, likely because a competitor peptide was included in the native measurement.
Next, we assessed the binding affinity of the H-2D^b^/gp33-specific P14 TCR to CSM8 H-2D^b^ loaded with three variants of the gp33 peptide that bind to H-2D^b^ with similar affinity but altered recognition by the P14 TCR^30,46^ (fig. 2B). Surface plasmon resonance (SPR) measurements revealed TCR binding affinities similar to native H-2D^b^/peptide complexes for all three variants^46^ (table 1; fig. S3; fig. 2C), further indicating that our stabilizing domain maintains the peptide-binding domain and peptide in a native-like conformation.
As a final validation of our design, the crystal structure of the CSM8 H-2D^b^/gp33 complex was determined to 2.0 Å resolution. This high resolution allowed us to analyze the interactions formed between the peptide binding cleft and the stabilizing domain underneath, as well as to compare the conformation of the presented peptide, and of residues known to be essential for TCR recognition, with the native H-2D^b^ molecule presenting the same epitope^31,47^. The structures of the peptide binding groove and full CSM8 H-2D^b^ design closely matched the native structure and design model, with Cα-RMSDs of 0.60 Å over 176 atoms and 0.70 Å over 233 atoms, respectively (fig. 2D,E). Examination of the region surrounding the β2m tryptophan residue W60 revealed that the Trp sidechain takes a similar conformation in the CSM8 design model and both crystal structures (fig. 2F). Furthermore, the peptide backbone aligns well with the native conformation, and the side chains display only minor changes between the two crystal structures (fig. 2G). These small changes, along with a few minor shifts elsewhere in the CSM8 H-2D^b^ structure, could explain the slight deviations in binding affinities observed in our SPR experiments. Overall, these results demonstrate that CSM8 H-2D^b^ retains the structural features that allow peptide and TCR binding, without the need for refolding.
CSM8 HLA A*02:01 displays partial solubility and stability
Next, we tested the generalizability of our stabilizing domain to other MHC allomorphs. Based on the structural and sequence similarity of H-2D^b^ to many HLAs, we did not introduce any modifications to the stabilizing domain, varying only the MHC sequence. We selected HLA A*02:01 due to its high frequency across different ethnic groups^48^, and the large number of well-characterized peptide epitopes and TCRs that bind to it^49,50^. Although the CSM8 A*02:01 construct could be expressed in a soluble form in E. coli, a portion of the soluble material was in a dimeric (and likely misfolded) state, which reappeared after isolation of the monomeric fraction (fig. 3A).
To test the ability of CSM8 A*02:01 to present peptides, we selected the tumor-associated antigen NY-ESO-1^51^ and the well-characterized NY-ESO-1/A*02:01-specific TCR, 1G4^50^. We used FP methods very similar to those used to evaluate SMART H-2D^b^ to measure the K_D,app_ of NY-ESO-1 to CSM8 A*02:01 and found it to be well above 1 μM (fig. 3B), much weaker than the 40 nM affinity measured for the native A*02:01 using competition binding assays^55^. The TCR binding affinity was similarly impacted, resulting in roughly 15–20 fold weaker binding than reported values for native A*02:01^50^ (table 2; fig. S4A,B).
We reasoned that these differences in affinity resulted from a lowered effective concentration due to misfolding and dimerization. Therefore, we prepared CSM8 A*02:01/NY-ESO-1 complexes by standard refolding procedures and measured their affinity for the 1G4 TCR. We found that these affinities were much closer to the expected native affinities (table 2; fig. 3C; fig. S4C,D), although they were still weaker by roughly 2–3 fold. The improvement in affinity achieved by refolding, and the similar ratio of affinities between the two peptide variants across all formats, indicate that CSM8 A*02:01 can maintain a native-like structure, but only for a small fraction of molecules in solution. Overall, these data suggest that our stabilizing domain generalizes weakly to new allotypes, providing limited stabilization and solubility to A*02:01.
The SMART stabilizer facilitates yeast display of A*02:01
As noted above, yeast display can be a powerful tool to screen libraries of peptide variants for TCR binding^19,22^. Thus, we evaluated the ability of SMART A*02:01 to express on the yeast surface and bind a known TCR. We compared hit6 and CSM8 versions to assess the impact of the cleavage site mutations in this expression system. We fused the TAX9 peptide^53^ to both variants, including the W167A MHC mutation to accommodate the peptide linker^19^ (fig. S5B). We compared the SMART A*02:01 designs to native A*02:01 single-chain trimers (SCT) (fig. S5A) and found that they were displayed at higher levels and achieved similar levels of binding to the high-affinity A6c134 TCR^49,53^ without binding to unrelated TCR or streptavidin (SA) controls (fig. 3D,E; fig. S5). Hit6 but not CSM8 was stained by an anti-A*02:01 antibody, suggesting that both variants are properly folded, but the cleavage site mutations prevent antibody recognition. Thus, SMART A*02:01 enhances yeast expression levels of HLA A*02:01 while retaining TCR binding specificity.
CSM8 A*02:01 presents TAX9 peptide to a TCR in a native-like manner
To investigate whether CSM8 A*02:01 presents peptides and interacts with TCR in a native-like way, we determined the crystal structure of CSM8 A*02:01/TAX9 complexed with the A6c134 TCR. As in our SPR experiments with this allotype, refolding methods were necessary to make sufficient quantities of monomeric CSM8 A*02:01 for crystallography. We found excellent agreement between CSM8 and native A*02:01 within the HLA (Cα-RMSD 0.40 Å over 161 atoms) and TAX9 peptide (all-atom RMSD 0.51 Å) (fig. 4A). CSM8 A*02:01 conserves the extensive hydrogen bond and hydrophobic contact network to the C- and N-termini of the TAX9 peptide (fig. 4B, S6).
We further found that the TCR variable domains engage CSM8 A*02:01 nearly identically to native A*02:01 (Cα-RMSD 0.54 Å for 69 atoms in the complementarity determining region (CDR) loops). Though not all side chains are clearly visible in the electron density, those that are observed suggest that most or all residue contacts from the native complex are retained. In particular, we observe hydrogen bonds from the TCR β chain to the α2 helix of CSM8 A*02:01 including Glu102^β^ to H221 and Ala101^β^ to Ala220. At the α2 helix, the TCR α chain contributes hydrogen bonds from Asp99^α^ and Thr98^α^ to Arg135 and Gln30^α^ to K136 and also an extensive hydrophobic interaction between Trp101^α^ and the surface formed by Gln132, Ala139, and Lys138 (fig. 4C). A6c134 also makes four hydrogen bonds to TAX9: Glu30^β^ to Tyr8, Ser100^α^ to Gly4, Ser31^α^ to Tyr5, and a mainchain-mainchain contact between Ser100^α^ and Gly4 (fig. 4D). The peptide conformation in the CSM8 and native A*02:01/TAX9/A6c134 crystal structures is conserved, with an all-atom RMSD of 0.479 Å (fig. 4E). The close agreement of the CSM8 and native structures, in conjunction with peptide and TCR binding data, suggests that, despite issues with soluble expression, CSM8 A*02:01 can present peptides in the same conformation as native A*02:01.
Rigid linker design improves expression and maintains peptide and TCR binding
Although we were able to purify soluble material for SMART versions of both H-2D^b^ and A*02:01, yields were lower than desired (table S7), likely due to aggregation during expression. We hypothesized that this aggregation was due, in part, to the flexibility of the poly-GGS linker connecting the stabilizer to the MHC. Thus, we used inpainting, ProteinMPNN, and AlphaFold2^54–56^ to generate rigid linkers to replace the flexible linker in CSM8. We refer to the most soluble of these improved designs as CSM8-L11 throughout the remainder of the text (denoted by purple coloring). As intended, the rigid linker significantly improved yields in both E. coli and CFE expression systems and reduced aggregation for H-2D^b^ without impacting the gp33 binding affinity (table S7, fig. S7).
We also found CSM8-L11 H-2D^b^ to be highly shelf-stable, retaining strong peptide binding affinity after storage for at least one month at 4°C, and showing only a minor decrease in affinity following multiple freeze/thaw cycles (fig. S8A–B). To further assess peptide binding and stability, we measured circular dichroism (CD) spectra and melting curves for CSM8-L11 H-2D^b^ in the presence and absence of the gp33 peptide. The CD spectra are consistent with the mixed αβ fold of the design model (fig. S8C), while the melting curves demonstrate a clear stabilization of the fold in the presence of the peptide (fig. S8D).
After confirming that the rigid linker improved solubility and stability of H-2D^b^, we tested its ability to improve these properties in A*02:01. We found that the yields were comparable (table S7), although CSM8-L11 A*02:01 was somewhat more prone to aggregation than CSM8 A*02:01 (fig. S9A). Despite the increased aggregation, CSM8-L11 showed a slight improvement in K_D,app_ when binding to the NY-ESO-1 peptide, and similar A6c134 TCR binding in yeast display experiments (fig. S9B–E). In combination with the improvements observed in H-2D^b^, these results suggest that the rigid linker represents an overall improvement to SMART MHCs, and we therefore focused on this improved design in subsequent experiments.
CSM8-L11 solubilizes several additional common HLA allomorphs
Given the improvements in expression observed with CSM8-L11, we expressed 15 human HLA allomorphs using this stabilizing domain in E. coli and performed small-scale purification using HPLC. CSM8-L11 versions of many allomorphs exhibited peaks in the expected monomeric range, though many also displayed dimeric and aggregate peaks (fig. 5A). Notably, HLA A*03:01 and HLA A*01:01 demonstrated high protein expression, and HLA B*07:02 showed the highest monomeric fraction (fig. 5B). The presence of a significant dimeric or aggregated population, particularly in allomorphs with high expression, indicates that the stabilizer was ineffective at simultaneously promoting high protein expression and monomeric behavior across diverse HLAs. However, four allomorphs showed increased expression, and three showed reduced dimerization compared to CSM8-L11 A*02:01, suggesting potential for improvement through minor redesign.
We next compared the expression of native, full-length HLAs with their CSM8-L11 versions, with and without genetically linked peptides, using CFE. Soluble protein levels were measured using radiolabeled ^14^C-Leucine incorporation (fig. 5C). About 93% (27/29) of SMART HLA constructs showed increased soluble expression versus their full-length counterparts, with an average 5.6-fold improvement and several exceeding 10-fold. Enhanced solubility in CFE may result from reduced crowding, a more oxidizing environment, or DsbC chaperone activity. Together, E. coli and CFE data suggest that while the stabilizing domain aids solubilization, further optimization is needed to reduce aggregation and enhance generalizability across HLAs.
Peptide-fused SMART MHC oligomers stain T-cells in a TCR-specific manner
An important application of pMHCs is the staining and identification of T-cells using pMHC tetramers^1^. We therefore tested whether CSM8-L11 H-2D^b^ could be converted into a similar multimeric staining reagent. Rather than using SA to tetramerize our designs, as is typically done, we chose to directly fuse them to a de-novo designed oligomeric protein which assembles into a tetrahedral architecture containing 12 subunits^57^. This allowed us to omit the biotinylation step necessary for SA-based tetramerization^58^. To improve folding and assembly of oligomeric CSM8-L11 H-2D^b^, we fused a peptide of interest to the N-terminus of the construct along with a SUMO tag. Co-expression of the Ulp1 protease allows the N-terminus of the peptide to be cleanly cleaved, enabling it to bind properly in the peptide binding groove. The Y84A mutation in the H-2D^b^ sequence was used to accommodate the peptide linker ^59^. We fused a Myc tag to the C-terminus of the oligomer to allow for antibody staining (fig. S10A–C).
To assess the ability of the CSM8-L11 H-2D^b^ MHC oligomers to stain T-cells, we mixed two populations of Jurkat T-cells: one expressing the P14 TCR (fig. 6A) and the other expressing the unrelated TCR, OT-I^60^ (fig. S10D,E). To distinguish the two cell lines independently of TCR staining, we labeled the OT-I Jurkats with CFSE dye. We fused CSM8-L11 H-2D^b^ oligomers to gp33 peptide variants with known affinities to the P14 TCR^30,46^ and assessed their ability to stain the T-cell mixture. Our T-cell staining data reflected previously observed trends in native pMHC/TCR binding^30,46^; the highest affinity variant (V3P) showed the brightest staining, with staining intensity decreasing with decreasing affinity (fig. 6; fig. S10). However, the Y4F variant showed clear staining in our experiments, despite prior work showing that the P14 TCR does not recognize this peptide^46^. This non-native behavior matches our SPR measurements showing weak binding of the P14 TCR to CSM8 H-2D^b^/Y4F (fig. 2C), indicating that it is consistent across experimental contexts. The detection of binding to Y4F in both experimental settings suggests that this difference from native behavior likely results from minor structural differences between SMART and native H-2D^b^, rather than specific features of the oligomeric format used here. Nevertheless, staining was both TCR- and peptide-specific, suggesting that CSM8-L11 H-2D^b^ oligomers could be used alongside pMHC tetramers to more easily identify and isolate pMHC-specific T-cells.
Discussion
Our success in creating a single stabilizing domain that allows soluble expression in E. coli of H-2D^b^ and partial solubilization of A*02:01 and other HLAs provides a strong foundation for a generalizable stabilizing scaffold for other MHC-I allomorphs. Our structural and biochemical data confirm that SMART H-2D^b^ maintains critical peptide and TCR interactions, while SMART A*02:01 shows weakened binding interactions. Our structural data also show that, for both allomorphs, the conformation of the presented peptide is nearly identical to the native conformation, and that for refolded CSM8 A*02:01, the TCR binds through the same set of interactions in both the native and SMART complexes. Recent work has also demonstrated that the reduced size of refolded SMART A*02:01 facilitates rapid characterization of peptide- and TCR-MHC interactions via NMR^61^. Thus, SMART MHCs can provide an accessible complement to conventional MHCs, bypassing the need for refolding and improving stability for H-2D^b^, and providing a convenient miniaturized form for A*02:01.
In addition to their utility in biophysical and T-cell staining experiments, SMART MHCs could improve high-throughput measurements of pMHC/TCR interactions. Their high expression levels on the yeast surface could expedite screening of pMHC/TCR interactions in yeast display experiments. Additionally, the soluble expression of SMART MHCs in cell-free systems opens new routes to characterize pMHC-TCR interactions in high throughput by taking advantage of the scalable nature of CFE^34,38,62,63^. Finally, the ability to produce a tetramer-like T-cell staining reagent without labor-intensive refolding protocols has the potential to reduce the barriers to performing T-cell tracking or sorting experiments. The SMART system could dramatically increase the number of peptide and MHC variants that can be tested, enabling improved tracking of immune responses to infectious disease, identification of cancer-targeting T-cell clones, and many other applications. To fully realize this potential, the SMART design will need further improvement, as many HLAs are not expressible in functional form with the SMART domain fusion (fig. 5). With such improvements, SMART MHCs have the potential to rapidly accelerate our understanding of T-cell behavior and TCR specificity.
Limitations and future work
The SMART MHCs presented here are prototypes for a new strategy to produce small, modular MHC proteins; further refinement of these designs will be necessary before they can be used as general reagents with broad utility across MHC allotypes. While we were able to express a significant soluble fraction of H-2D^b^ in the cytosol of E. coli, our results with A*02:01 and other HLAs were less successful. Furthermore, only a fraction of soluble A*02:01 is in a peptide-receptive state, and this allotype was prone to dimerization and showed weakened binding affinities. While SMART A*02:01 showed display on yeast by TCR staining, library generation for TCR selection with this scaffold has not yet been achieved. Other SMART HLAs showed signs of similar liabilities, though they were not tested extensively. To overcome these limitations and make SMART MHCs broadly useful in a variety of biochemical and immunological applications, work on next-generation versions is focused on screening for stabilizers that promote both peptide and TCR binding across diverse MHC allotypes.
Methods
Stabilizer library design
We designed this stabilizing domain using previously developed computational methods for designing protein binders against arbitrary target proteins^28^. The “target” supplied to this method was the α1 and α2 domains of the H-2D^b^ structure (PDB: 1S7U). Briefly, we docked scaffolds to the underside of the truncated structure and designed favorable interactions, including some taken from the native interactions with β2m. Designs were filtered for the quality of the stabilizer and the interactions it made with the MHC, as previously described^28^. We also included a set of negative control designs made by randomly scrambling the sequence (while conserving the pattern of hydrophobic and hydrophilic residues) of a random subset of the designs. A detailed description of this protocol is provided in https://github.com/wlwhite-tufts/code-for-SMART-MHC-manuscript.
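One way to picture the scrambled negative controls is the minimal Python sketch below. It is not the authors' protocol (which is in the linked repository); the hydrophobic residue set, the function names, and the example sequence are assumptions made only for illustration. Residues are shuffled within the hydrophobic and hydrophilic classes separately, so the composition and the hydrophobic/hydrophilic pattern are both conserved.

```python
import random

# Assumed classification; the text does not say which residues count as hydrophobic.
HYDROPHOBIC = set("AVILMFWYC")


def hp_pattern(seq):
    """Return the hydrophobic (H) / hydrophilic (P) pattern of a sequence."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in seq)


def scramble_preserving_hp_pattern(seq, rng):
    """Shuffle residues within each class so every position keeps its class."""
    hydrophobic = [aa for aa in seq if aa in HYDROPHOBIC]
    polar = [aa for aa in seq if aa not in HYDROPHOBIC]
    rng.shuffle(hydrophobic)
    rng.shuffle(polar)
    # Refill each position from the shuffled pool of its own class.
    return "".join(
        hydrophobic.pop() if aa in HYDROPHOBIC else polar.pop() for aa in seq
    )


example = "MKVLEELAKRAIELAG"  # hypothetical sequence, not one of the designs
scrambled = scramble_preserving_hp_pattern(example, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the scrambling reproducible, which is convenient when the same control set must be regenerated later.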
Yeast display screening of stabilizing domains
Yeast display was performed as previously described^28^ with the following changes. Rather than screening our designs for binding to the α1 and α2 domains of H-2D^b^, we fused the designs to those domains using a flexible poly-GS linker and screened them for surface display and binding to a FITC-labeled gp33 peptide (KAVYNFATM), with FITC linked to the amine on the lysine sidechain. Sorted populations were subsequently cultured and plasmid DNA was extracted for sequencing. FITC-gp33 was purchased from GenScript as a custom peptide synthesis.
Cell-free gene expression (CFE) and radioactive quantification of soluble protein yields
CFE reactions were prepared with a version of the existing PANOx-SP formulation that uses glutathione and DsbC to create an oxidizing environment^35–37^, and ^14^C-leucine for later measurements of yield. All components were added together on ice^38,40^. This final reaction mixture was added to 6.66% v/v unpurified linear DNA PCR products (LETs) in triplicate^63,64^, and incubated at 30°C overnight. Samples were spun at 16,000×g for 10 minutes at 4°C to separate total and soluble protein fractions. Supernatants were incubated at 37°C for 20 minutes with 0.25 N KOH. Fractions were spotted on a 96 well filtermat (Revvity 1450–421), dried, washed, and coated with wax, and protein yields were quantified with a scintillation counter. Further details are provided in the supplemental methods.
Cleavage site mutation design
Cleavage site design was used to make mutations to the MHC sequence to reduce proteolysis without impacting peptide or TCR binding. First, a multiple sequence alignment (MSA) of MHC protein sequences was collected using PSI-BLAST^65^. Next, the MSA was converted into a position-specific score matrix (PSSM) denoting the likelihood of observing each possible amino acid at each position in the MSA. Standard Rosetta design protocols^66^ were modified to allow mutations to amino acids that had likelihoods above a specified cutoff, and to only allow mutations at sequence positions near the pre-identified cleavage sites. Design was further restricted to prevent mutations in residues with sidechains that could interact with a bound peptide or TCR. All CSMs were designed based on the hit6 design model, and the highest scoring designs were selected for experimental testing based on a combination of Rosetta metrics relating to the quality of the design model and strength of the interface. A detailed description of this protocol is provided in https://github.com/wlwhite-tufts/code-for-SMART-MHC-manuscript.
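The residue-filtering step of this protocol can be sketched in Python as follows. This is an illustrative simplification rather than the authors' implementation: it uses raw column frequencies in place of a PSI-BLAST-derived PSSM, omits the Rosetta design step entirely, and the function names, toy alignment, positions, and cutoff are all hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(msa):
    """Per-position amino acid frequencies from aligned sequences (gaps ignored)."""
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for i in range(length):
        counts = Counter(seq[i] for seq in msa if seq[i] in AMINO_ACIDS)
        total = sum(counts.values())
        profile.append({aa: n / total for aa, n in counts.items()} if total else {})
    return profile


def allowed_mutations(msa, designable, fixed, cutoff):
    """For each designable, non-fixed position, list residues with frequency >= cutoff.

    designable: positions near the identified cleavage sites (0-based).
    fixed: positions whose sidechains could contact the peptide or TCR.
    """
    profile = column_frequencies(msa)
    return {
        pos: sorted(aa for aa, freq in profile[pos].items() if freq >= cutoff)
        for pos in sorted(set(designable) - set(fixed))
    }


toy_msa = ["GSHSMRY", "GSHSLKY", "GPHSMRY", "GSHSLRY"]  # hypothetical alignment
result = allowed_mutations(toy_msa, designable=[1, 4, 5], fixed=[5], cutoff=0.3)
```

In this toy case position 5 is excluded because it is marked as fixed, and proline at position 1 is excluded because its frequency (0.25) falls below the cutoff.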
Peptide binding affinity measurements
Three technical replicates of varying concentrations of SMART MHC were mixed with a constant concentration of fluorophore-labeled peptide (300 pM FITC-gp33 for H-2D^b^ and 10 nM AF488-NY-ESO-1 for A*02:01) and incubated overnight at room temperature to allow equilibration. Fluorescence polarization measurements of these samples were made with a Synergy Neo2 plate reader (BioTek Instruments) with a 485/530 FP filter. Binding curves (using the non-simplified equilibrium binding equation^67^) were fitted separately to each of the triplicate measurements and averaged to determine the K_D_. The peptides used in these experiments were synthesized in-house using the methods described above.
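The cited equation is not reproduced in the text. The Python sketch below assumes it is the standard quadratic solution for 1:1 binding, which does not approximate free protein by total protein; that matters here because the apparent K_D_ for gp33 is comparable to the 300 pM probe concentration. The function names and the two-state polarization model are illustrative assumptions.

```python
import math


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of labeled ligand bound for 1:1 binding (quadratic solution).

    All three arguments must share the same concentration units.
    """
    s = protein_total + ligand_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0
    return complex_conc / ligand_total


def polarization(protein_total, ligand_total, kd, fp_free, fp_bound):
    """Predicted FP signal as a bound-fraction-weighted mix of free and bound values."""
    f = fraction_bound(protein_total, ligand_total, kd)
    return fp_free + (fp_bound - fp_free) * f
```

For example, `fraction_bound(1.0, 0.3, 1.0)` gives the bound fraction of a 0.3 nM probe at 1 nM protein for a hypothetical K_D_ of 1 nM; fitting would adjust `kd`, `fp_free`, and `fp_bound` to the measured titration.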
Circular dichroism measurements
Purified soluble SMART H-2D^b^ was diluted to 0.3 mg/mL (9.0 μM) in 25 mM Tris (pH 8.0), 150 mM NaCl, and 5% glycerol. The gp33 peptide stock was prepared by dissolving dry peptide (purchased from GenScript as a custom synthesis) to 2 mg/mL (1.52 mM) in methanol. Two-fold molar excess of peptide stock (or an equivalent volume of methanol) was added to the protein sample and incubated at 4°C overnight. CD spectra were measured using a Jasco J-1500, and melting curves were measured in increments of 0.5°C at a rate of 2°C/min.
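As a quick consistency check on the concentrations quoted above, the following Python sketch back-calculates the molar masses they imply and the stock volume needed for a two-fold molar excess. The helper names and the 100 μL sample volume are assumptions for illustration; only the concentrations come from the text.

```python
def molar_mass_from_concentrations(mg_per_ml, micromolar):
    """Molar mass (g/mol) implied by a mass concentration and a molar concentration."""
    grams_per_liter = mg_per_ml  # 1 mg/mL equals 1 g/L
    mol_per_liter = micromolar * 1e-6
    return grams_per_liter / mol_per_liter


def stock_volume_for_excess(sample_volume_ul, sample_um, stock_um, fold_excess):
    """Stock volume (μL) that delivers fold_excess molar equivalents to the sample."""
    return sample_volume_ul * sample_um * fold_excess / stock_um


protein_mass = molar_mass_from_concentrations(0.3, 9.0)     # about 33.3 kDa
peptide_mass = molar_mass_from_concentrations(2.0, 1520.0)  # about 1.3 kDa
# Hypothetical 100 μL protein sample receiving a two-fold excess of peptide.
volume_ul = stock_volume_for_excess(100.0, 9.0, 1520.0, 2.0)
```

Under these assumptions only about 1.2 μL of peptide stock is needed per 100 μL of protein, so the methanol added with the peptide stays near 1% of the sample volume.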
Surface Plasmon Resonance (SPR) measurements
TCR binding affinities were measured as previously described^46^, using CSM8 H-2D^b^ as the mobile phase. All measurements were performed on a BIAcore T200 (GE Healthcare) at 20°C in HBS with 0.005% Tween-20 and 3 mM EDTA. Soluble P14–6xHis (0.75 μM) was bound to immobilized anti-His on a CM5-chip. Varied concentrations of freshly produced CSM8 H-2D^b^/peptide complexes were injected over the chip surfaces, maintained at 4°C. Chip surfaces were regenerated using a low pH buffer after each injection. The final signal was calculated by subtracting the signal obtained on the control (no TCR) surface from the signal on the TCR-coupling surface. SPR data were analyzed with BIAevaluation 3.0 software (Cytiva), and K_D_ values were obtained from steady-state fitting of equilibrium binding curves from at least ten sample injections. Both A*02:01 and H-2D^b^ SPR measurements were performed using the same approach. Further details are provided in the supplemental methods.
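As an illustration of the analysis steps above, the following sketch performs reference subtraction and a steady-state fit to the 1:1 model R_eq = R_max · C / (K_D + C). It is not the BIAevaluation algorithm: it scans K_D on a logarithmic grid and solves for R_max in closed form at each grid point, which is a simple stand-in for nonlinear least squares. The concentrations, responses, and grid bounds are synthetic assumptions.

```python
def reference_subtract(active, reference):
    """Subtract the control (no TCR) surface signal from the TCR surface signal."""
    return [a - r for a, r in zip(active, reference)]


def fit_steady_state(concentrations, responses, kd_min=1e-3, kd_max=1e3, steps=2000):
    """Fit R_eq = Rmax * C / (Kd + C) and return (Kd, Rmax).

    Kd is scanned on a log-spaced grid; for each Kd the least-squares
    Rmax has a closed form because the model is linear in Rmax.
    Concentrations must be positive and share one unit with Kd.
    """
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        occupancy = [c / (kd + c) for c in concentrations]
        rmax = sum(r * f for r, f in zip(responses, occupancy)) / sum(f * f for f in occupancy)
        sse = sum((r - rmax * f) ** 2 for r, f in zip(responses, occupancy))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


# Synthetic ten-point titration (hypothetical units of uM and RU).
conc = [0.2, 0.4, 0.8, 1.6, 3.2, 6.4, 12.8, 25.6, 51.2, 102.4]
reference = [0.5 * c for c in conc]                      # bulk signal on control surface
active = [100.0 * c / (5.0 + c) + ref for c, ref in zip(conc, reference)]
kd, rmax = fit_steady_state(conc, reference_subtract(active, reference))
print(round(kd, 2), round(rmax, 1))
```

The grid resolution (about 0.7% per step with these defaults) bounds the precision of the returned K_D; a finer grid or a local refinement step would tighten it.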
CSM8 H-2D^b^ Crystallography
Crystallization of CSM8 H-2D^b^/gp33 was performed using the sitting-drop vapor diffusion method at 293.15 K. Drops were set up using 0.15 μL CSM8 H-2D^b^/gp33 and 0.15 μL reservoir solution (0.2 M sodium acetate trihydrate, 0.1 M TRIS hydrochloride pH 8.5, and 30% w/v polyethylene glycol, PEG 4,000), equilibrated against 50 μL reservoir solution. Crystals appeared after 6–13 days and were cryoprotected with a solution containing an additional 7.5% w/v PEG 4,000 and harvested using mounted CryoLoops (Hampton Research). Subsequently, the crystals were flash-frozen in liquid nitrogen for transportation and data collection. Diffraction images were collected at the automated beamline ID30 at the European Synchrotron Radiation Facility (ESRF) in Grenoble, France. Diffraction data were processed using autoPROC^68^. The crystal structure was determined by molecular replacement with Phaser-MR in PHENIX^69^, using the design model as the search model. Refinement was performed with PHENIX, followed by manual model building with Coot^70^, and final refinement with PHENIX^71–73^. Figures were generated using the PyMOL Molecular Graphics System (Schrödinger). The final coordinates and structure factors have been deposited in the PDB with accession code 9HY4.
Peptide-fused yeast display
Yeast display of peptide-fused hit6, CSM8 and SMART A*02:01 was performed as previously described^19^. In brief, 50 ng of pCT or pYAL plasmids encoding the corresponding full-length or hit6, CSM8 and SMART A*02 constructs with TAX were electroporated into competent EBY100 cells. The cells were cultured in YPD medium for 1 hour at 30°C, spun down, and grown in SDCAA medium for 48 hours before induction in SGCAA for 48 hours. Display levels were evaluated with fluorophore-conjugated anti-HA and anti-A*02:01 (clone BB7.2) antibodies, and TCR binding was evaluated with TCR tetramers made by combining soluble biotinylated A6 TCR with fluorophore-conjugated streptavidin (SA).
A6c134/CSM8 A*02:01/TAX9 Crystallography
Purified A6c134 TCR was combined with equimolar refolded CSM8 A*02:01/TAX9 and purified on a Superdex S200 10/300 Increase column in HBS. Diffraction-quality crystals were grown by microseed matrix screening. Diffraction images were collected at NSLS-2 beamline 17-ID-1 at Brookhaven National Laboratory. The structure was solved by molecular replacement and refined using data to 3.4 Å resolution. Detailed crystallographic methods can be accessed in the SI Appendix. An example of electron density at the final stage of refinement is shown in Supplementary Figure S6. Data collection and refinement statistics are presented in Supplementary Table S2. Structure figures were generated using PyMOL. Structure factors and final model coordinates have been deposited in the PDB with accession code 9NDS, and diffraction images have been deposited in the SBGrid Databank.
Linker design
Linker design was used to replace the flexible GS linker used in the yeast display screening with a shorter, structured linker. Starting with the design model of the CSM8 variant, we used existing “inpainting” methods^54^ to fill in a small segment of protein structure, bridging the gap between the C-terminus of the stabilizer and the N-terminus of the MHC. Sequences for the resulting protein backbones were designed using ProteinMPNN^55^, restricted to changing only the amino acids in the “inpainted” structure. Finally, the resulting designs were evaluated using AlphaFold2^56^ predictions with the MHC structure provided as a template. Designs with high overall pLDDT scores and low PAE scores for residues in the MHC/stabilizer interface were selected for experimental testing. A detailed description of this protocol is provided at https://github.com/wlwhite-tufts/code-for-SMART-MHC-manuscript.
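The final selection step can be sketched as a filter over per-design prediction metrics. This is an illustration only: the thresholds (mean pLDDT of at least 85, mean interface PAE of at most 8 Å), the dictionary layout, and the function names are assumptions, and the toy four-residue designs are invented. The actual criteria are described in the repository above.

```python
def mean_interface_pae(pae, stabilizer_idx, mhc_idx):
    """Mean predicted aligned error over all stabilizer/MHC residue pairs,
    counting both directions of the (asymmetric) PAE matrix."""
    values = []
    for i in stabilizer_idx:
        for j in mhc_idx:
            values.append(pae[i][j])
            values.append(pae[j][i])
    return sum(values) / len(values)


def select_designs(designs, min_plddt=85.0, max_interface_pae=8.0):
    """Keep designs with high mean pLDDT and low interface PAE,
    sorted by interface PAE (best first). Thresholds are hypothetical."""
    selected = []
    for design in designs:
        mean_plddt = sum(design["plddt"]) / len(design["plddt"])
        ipae = mean_interface_pae(design["pae"], design["stabilizer_idx"], design["mhc_idx"])
        if mean_plddt >= min_plddt and ipae <= max_interface_pae:
            selected.append((design["name"], mean_plddt, ipae))
    return sorted(selected, key=lambda item: item[2])


# Toy designs: residues 0-1 stand in for the stabilizer, 2-3 for the MHC.
designs = [
    {"name": "linker_a", "plddt": [90, 92, 88, 91],
     "pae": [[0, 1, 3, 4], [1, 0, 2, 3], [3, 2, 0, 1], [4, 3, 1, 0]],
     "stabilizer_idx": [0, 1], "mhc_idx": [2, 3]},
    {"name": "linker_b", "plddt": [91, 90, 89, 92],
     "pae": [[0, 1, 15, 18], [1, 0, 14, 16], [15, 14, 0, 1], [18, 16, 1, 0]],
     "stabilizer_idx": [0, 1], "mhc_idx": [2, 3]},
]
print(select_designs(designs))
```

Restricting the PAE average to interface residue pairs focuses the filter on whether the predicted relative placement of the stabilizer and MHC is confident, rather than on the confidence of each domain alone.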
Small-scale expression and SEC of HLA Alleles
Genes encoding CSM8-L11 HLAs were cloned into expression vector LM0627 (Addgene 191551) containing the MSG residues at the N-terminus and a SNAC linked to a 6xHis tag at the C-terminus, and transformed into E. coli. Cells were grown at 37°C to an OD_600_ of 1 in TB-II with kanamycin, induced with IPTG, and moved to 16°C for overnight expression. Cells were pelleted and lysed with BugBuster^®^, and insoluble material was removed by centrifugation. Soluble lysate was purified in a 96-well plate using Ni-NTA affinity chromatography, filtered, and analyzed using HPLC (Agilent) with an S75 column (Cytiva).
T-cell staining
OT-I TCR Jurkat cells were stained with CFSE and washed. OT-I and P14 Jurkat lines were mixed in equal numbers, resuspended to 1 × 10^6^ cells/mL in 100 μL of staining solution containing either a native MHC tetramer (PE/H-2K^b^/OVA or APC/H-2D^b^/gp33) or CSM8-L11 H-2D^b^/peptide oligomer, incubated at 4°C for 30 min, and washed. Oligomer-treated cells were then stained with AF647-anti-Myc, washed and resuspended in HBH for analysis on an Attune NxT flow cytometer. Further details are provided in the supplemental methods.
References
- 1. Wooldridge L. Tricks with tetramers: how to get the most from multimeric peptide–MHC. Immunology 126, 147–164 (2009). PMID: 19125886; doi: 10.1111/j.1365-2567.2008.02848.x; PMCID: PMC2632693.
- 2. Sibener LV. Isolation of a Structural Mechanism for Uncoupling T Cell Receptor Signaling from Peptide-MHC Binding. Cell 174, 672–687.e27 (2018). PMID: 30053426; doi: 10.1016/j.cell.2018.06.017; PMCID: PMC6140336.
- 3. Pettmann J. The discriminatory power of the T cell receptor. eLife 10, e67092 (2021). PMID: 34030769; doi: 10.7554/eLife.67092; PMCID: PMC8219380.
- 4. Linnemann C. High-throughput identification of antigen-specific TCRs by TCR gene capture. Nat. Med. 19, 1534–1541 (2013). PMID: 24121928; doi: 10.1038/nm.3359.
- 5. Murali-Krishna K. Counting Antigen-Specific CD8 T Cells: A Reevaluation of Bystander Activation during Viral Infection. Immunity 8, 177–187 (1998). PMID: 9491999; doi: 10.1016/s1074-7613(00)80470-7.
- 6. Kronenberg D. Circulating Preproinsulin Signal Peptide–Specific CD8 T Cells Restricted by the Susceptibility Molecule HLA-A24 Are Expanded at Onset of Type 1 Diabetes and Kill β-Cells. Diabetes 61, 1752–1759 (2012). PMID: 22522618; doi: 10.2337/db11-1520; PMCID: PMC3379678.
- 7. Rowntree LC. Preferential HLA-B27 Allorecognition Displayed by Multiple Cross-Reactive Antiviral CD8+ T Cell Receptors. Front. Immunol. 11 (2020). PMID: 32140156; doi: 10.3389/fimmu.2020.00248; PMCID: PMC7042382.
- 8. Li S. Characterization of neoantigen-specific T cells in cancer resistant to immune checkpoint therapies. Proc. Natl. Acad. Sci. 118, e2025570118 (2021). PMID: 34285073; doi: 10.1073/pnas.2025570118; PMCID: PMC8325261.
