TCR2HLA: Calibrated inference of HLA genotypes from TCR repertoires enables identification of immunologically relevant metaclonotypes
Koshlan Mayer-Blackwell, Anastasia Minervina, Mikhail Pogorelyy, Puneet Rawat, Melanie R. Shapiro, Leeana D. Peters, Emily S. Ford, Amanda L. Posgai, Kasi Vegesana, Samuel Minot, David M. Koelle, Victor Greiff, Philip Bradley, Todd M. Brusko, Paul G. Thomas, Andrew Fiore-Gartland

TL;DR
TCR2HLA is an open-source tool that infers HLA genotypes from TCRβ repertoires, enabling discovery of TCRs linked to specific HLA alleles and to infections such as SARS-CoV-2.
Contribution
Introduces TCR2HLA, a calibrated framework for inferring HLA genotypes from TCR repertoires and identifying immunologically relevant TCRs.
Findings
Binary HLA classifiers built on TCRβ features reached balanced accuracy above 0.9 for most common alleles tested across multiple datasets: HLA-A (9/12), B (9/12), C (6/13), and DRB1 (11/11).
Identified ~96,000 TCRβ features strongly associated with specific HLA alleles from 71M input TCRs.
Enabled discovery of SARS-CoV-2-related TCRs in a dataset lacking HLA data.
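The abstract reports classifier performance as balanced accuracy, which is the mean of sensitivity and specificity. That choice matters because most HLA alleles are carried by a minority of individuals, so plain accuracy can look high for a classifier that never predicts a carrier. The sketch below illustrates the metric on hypothetical labels; the function name and the data are illustrative assumptions, not part of TCR2HLA.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = carries the allele)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one carrier and one non-carrier")
    return 0.5 * (tp / pos + tn / neg)


# Hypothetical cohort: 2 carriers and 8 non-carriers of some allele.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                    # plain accuracy 0.8
one_hit = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]      # one true positive, one false positive

print(balanced_accuracy(y_true, always_negative))  # 0.5
print(balanced_accuracy(y_true, one_hit))          # 0.6875
```

A classifier that always predicts "non-carrier" scores 0.8 on plain accuracy here but only 0.5 on balanced accuracy, which is why a threshold such as 0.9 on the balanced metric is a more demanding bar for rare alleles.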
Abstract
T cell receptors (TCRs) recognize peptides presented by polymorphic human leukocyte antigen (HLA) molecules, but HLA genotype data are often missing from TCR repertoire sequencing studies. To address this, we developed TCR2HLA, an open-source tool that infers HLA genotypes from TCRβ repertoires. Expanding on work linking public TRBV-CDR3 sequences to HLA genotypes, we incorporated “quasi-public” metaclonotypes – composed of rarer TCRβ sequences with shared amino acid features – enriched by HLA genotypes. Using four TCRβseq datasets from 3,150 individuals, we applied TRBV gene partitioning and locality-sensitive hashing to identify ~96,000 TCRβ features strongly associated with specific HLA alleles from 71M input TCRs. Binary HLA classifiers built with these features achieved high balanced accuracy (>0.9) across common HLA-A (9/12), B (9/12), C (6/13), DRB1 (11/11) alleles and prevalent…
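The abstract mentions TRBV gene partitioning and locality-sensitive hashing as the way similar TCRβ sequences are grouped at scale. The hashing scheme itself is not described in this summary, so the sketch below substitutes a simpler, related idea: within each TRBV partition, every CDR3 is bucketed under keys that mask one position, so two sequences of equal length that differ at exactly one residue fall into a shared bucket without comparing all pairs. The function names and the example sequences are hypothetical and are not taken from TCR2HLA.

```python
from collections import defaultdict
from itertools import combinations


def masked_keys(cdr3):
    """Yield one key per position, with that residue replaced by a wildcard."""
    for i in range(len(cdr3)):
        yield cdr3[:i] + "." + cdr3[i + 1:]


def near_neighbor_pairs(clonotypes):
    """Find CDR3 pairs that share a TRBV gene and differ at exactly one position.

    clonotypes: iterable of (trbv, cdr3) tuples.
    Returns a set of (trbv, cdr3_a, cdr3_b) with cdr3_a < cdr3_b.
    """
    buckets = defaultdict(set)
    for trbv, cdr3 in set(clonotypes):        # drop exact duplicates
        for key in masked_keys(cdr3):
            buckets[(trbv, key)].add(cdr3)    # partition by TRBV, then by masked key

    pairs = set()
    for (trbv, _key), members in buckets.items():
        for a, b in combinations(sorted(members), 2):
            pairs.add((trbv, a, b))
    return pairs


# Hypothetical input: two near-identical CDR3s on TRBV19, one unrelated CDR3,
# and one identical CDR3 on a different TRBV gene (kept in a separate partition).
clonotypes = [
    ("TRBV19", "CASSIRSSYEQYF"),
    ("TRBV19", "CASSIRSAYEQYF"),
    ("TRBV19", "CASSLAPGATNEKLFF"),
    ("TRBV7-9", "CASSIRSSYEQYF"),
]

print(near_neighbor_pairs(clonotypes))
```

Bucketing costs roughly one key per residue per sequence, which is what makes a neighbor search feasible on tens of millions of TCRs, where an all-against-all comparison would not be. Groups of such neighbors are one plausible starting point for the "quasi-public" metaclonotypes the abstract describes, though the published definition may differ.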
Taxonomy
Topics: T-cell and B-cell Immunology · Vaccines and Immunoinformatics Approaches · Cancer Immunotherapy and Biomarkers
