Contrastive learning of T cell receptor representations
Yuta Nagano, Andrew Pyo, Martina Milighetti, James Henderson, John Shawe-Taylor, Benny Chain, Andreas Tiffeau-Mayer

TL;DR
This paper introduces SCEPTR, a contrastive learning-based TCR language model that improves TCR specificity prediction by combining autocontrastive learning with masked-language modeling, achieving state-of-the-art results.
Contribution
The paper presents a novel pre-training strategy for TCR language models that enhances data efficiency and predictive performance.
Findings
SCEPTR outperforms existing protein language models in TCR specificity prediction.
Contrastive learning combined with masked-language modeling improves model performance.
Sequence alignment-based methods still outperform some pre-trained models without contrastive learning.
Abstract
Computational prediction of the interaction of T cell receptors (TCRs) and their ligands is a grand challenge in immunology. Despite advances in high-throughput assays, specificity-labelled TCR data remains sparse. In other domains, the pre-training of language models on unlabelled data has been successfully used to address data bottlenecks. However, it is unclear how to best pre-train protein language models for TCR specificity prediction. Here we introduce a TCR language model called SCEPTR (Simple Contrastive Embedding of the Primary sequence of T cell Receptors), capable of data-efficient transfer learning. Through our model, we introduce a novel pre-training strategy combining autocontrastive learning and masked-language modelling, which enables SCEPTR to achieve its state-of-the-art performance. In contrast, existing protein language models and a variant of SCEPTR pre-trained…
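Illustrative sketch
The abstract describes a pre-training objective that combines autocontrastive learning with masked-language modelling. The NumPy sketch below shows one way such a combined objective could be written. It is not the authors' implementation: the InfoNCE formulation, the temperature of 0.05, the `mlm_weight` parameter, and all function names are assumptions made for illustration. Here `view_a` and `view_b` stand for embeddings of two stochastic views of the same batch of receptors, so row i of each is treated as a positive pair and all other rows as in-batch negatives.

```python
import numpy as np


def info_nce(view_a, view_b, temperature=0.05):
    """In-batch contrastive loss: row i of view_a should match row i of view_b.

    view_a, view_b: (batch, dim) embeddings of two views of the same inputs.
    The temperature value is an illustrative assumption.
    """
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature                      # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                 # positives sit on the diagonal


def masked_lm_loss(token_logits, targets, mask):
    """Cross-entropy averaged over masked positions only.

    token_logits: (batch, length, vocab) floats
    targets:      (batch, length) integer token ids
    mask:         (batch, length) booleans, True where a token was masked
    """
    shifted = token_logits - token_logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    picked = np.take_along_axis(log_probs, targets[..., None], axis=-1)[..., 0]
    return -(picked * mask).sum() / max(mask.sum(), 1)


def combined_loss(view_a, view_b, token_logits, targets, mask, mlm_weight=1.0):
    """Sum of the two objectives; the weighting scheme is hypothetical."""
    contrastive = info_nce(view_a, view_b)
    mlm = masked_lm_loss(token_logits, targets, mask)
    return contrastive + mlm_weight * mlm
```

In this sketch the contrastive term shapes the geometry of the whole-sequence embedding, while the masked-token term keeps the model predictive of individual residues. How the two terms are actually balanced in SCEPTR is not stated in the text above.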
Taxonomy
Topics: T-cell and B-cell Immunology · vaccines and immunoinformatics approaches · Monoclonal and Polyclonal Antibodies Research
Methods: Contrastive Learning
