Calibrated Abstention for Reliable TCR–pMHC Binding Prediction under Epitope Shift
Arman Bekov, Timur Bekzhanov, Bekzat Sadykov

TL;DR
This paper introduces a calibrated abstention framework for TCR–pMHC binding prediction: the model abstains on uncertain inputs so that the predictions it does make are more reliable, especially for epitopes unseen during training.
Contribution
The paper combines a dual-encoder architecture with temperature-scaling calibration and a conformal abstention rule to improve prediction reliability under epitope shift.
Findings
Achieves an AUROC of 0.813 and an expected calibration error (ECE) of 0.043 on challenging splits.
Reduces ECE by 69.7% relative to the uncalibrated baseline.
At 80% coverage, the error rate drops from 18.7% to 10.9%.
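For readers unfamiliar with the two reliability metrics quoted above, the following is a minimal sketch of how ECE and the error rate at a fixed coverage are commonly computed for a binary classifier. The function names, the 10-bin equal-width scheme, and the use of max(p, 1 - p) as the confidence score are assumptions made for illustration; the paper's exact evaluation protocol is not given on this page.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Binary ECE: weighted gap between confidence and accuracy per bin.

    Assumption: confidence is the probability of the predicted class and
    bins are equal-width on [0, 1]; the paper may bin differently.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    preds = (probs >= 0.5).astype(int)
    conf = np.where(preds == 1, probs, 1.0 - probs)
    correct = (preds == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Map each confidence to a right-closed bin (lo, hi].
    bin_ids = np.clip(np.digitize(conf, edges[1:-1], right=True), 0, n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return float(ece)

def error_at_coverage(probs, labels, coverage):
    """Error rate on the most confident `coverage` fraction of examples.

    The remaining examples are treated as abstentions.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels, dtype=int)
    preds = (probs >= 0.5).astype(int)
    conf = np.maximum(probs, 1.0 - probs)
    n_keep = max(1, int(round(coverage * len(probs))))
    keep = np.argsort(-conf, kind="stable")[:n_keep]
    return float((preds[keep] != labels[keep]).mean())

if __name__ == "__main__":
    # Toy scores, not data from the paper.
    p = [0.9, 0.8, 0.6, 0.4, 0.2]
    y = [1, 1, 0, 1, 0]
    print("ECE:", expected_calibration_error(p, y))
    print("error at 100% coverage:", error_at_coverage(p, y, 1.0))
    print("error at 60% coverage:", error_at_coverage(p, y, 0.6))
```

Read this way, "80% coverage" means the model answers on the 80% of inputs it is most confident about and abstains on the rest; the reported error rate is measured only on the answered inputs.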
Abstract
Predicting T-cell receptor (TCR)–peptide-MHC (pMHC) binding is central to vaccine design and T-cell therapy, yet deployed models frequently encounter epitopes unseen during training, causing silent overconfidence and unreliable prioritization. We address this by framing TCR–pMHC prediction as a selective prediction problem: a calibrated model should either output a trustworthy confidence score or explicitly abstain. Concretely, we (1) introduce a dual-encoder architecture encoding both CDR3α/CDR3β and peptide sequences via a pre-trained protein language model; (2) apply temperature scaling to correct systematic probability miscalibration; and (3) impose a conformal abstention rule that provides finite-sample coverage guarantees at a user-specified target error rate. Evaluated under three split strategies (random, epitope-held-out, and distance-aware), our method…
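Steps (2) and (3) of the abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes a binary classifier that outputs a single logit, fits the temperature by a simple grid search over held-out negative log-likelihood, and uses standard split conformal prediction (abstain whenever the prediction set is not a single label) as a stand-in for the paper's abstention rule. All function names and the grid range are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_temperature(logits, labels, grid=None):
    """Pick T > 0 that minimizes binary NLL of sigmoid(logits / T).

    Assumption: a grid search on a held-out calibration set; gradient-based
    fitting is equally common.
    """
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=float)
    if grid is None:
        grid = np.linspace(0.05, 10.0, 400)

    def nll(T):
        p = np.clip(sigmoid(logits / T), 1e-12, 1.0 - 1e-12)
        return -np.mean(labels * np.log(p) + (1.0 - labels) * np.log(1.0 - p))

    return float(min(grid, key=nll))

def conformal_quantile(cal_probs, cal_labels, alpha):
    """Split-conformal threshold on the score 1 - p(true label)."""
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    p_true = np.where(cal_labels == 1, cal_probs, 1.0 - cal_probs)
    scores = np.sort(1.0 - p_true)
    n = len(scores)
    k = int(np.ceil((n + 1) * (1.0 - alpha)))
    if k > n:
        return 1.0  # too few calibration points: every label is admitted
    return float(scores[k - 1])

def predict_or_abstain(probs, qhat):
    """Return 1, 0, or -1 (abstain) for each calibrated probability.

    A label enters the prediction set when its score is at most qhat; the
    model answers only when exactly one label is in the set.
    """
    probs = np.asarray(probs, dtype=float)
    in_pos = (1.0 - probs) <= qhat
    in_neg = probs <= qhat
    out = np.full(probs.shape, -1, dtype=int)
    out[in_pos & ~in_neg] = 1
    out[in_neg & ~in_pos] = 0
    return out

if __name__ == "__main__":
    # Toy calibration data, not from the paper.
    cal_p = [0.95, 0.9, 0.85, 0.1, 0.05, 0.2, 0.9, 0.15, 0.8]
    cal_y = [1, 1, 1, 0, 0, 0, 1, 0, 1]
    qhat = conformal_quantile(cal_p, cal_y, alpha=0.2)
    print(predict_or_abstain([0.97, 0.5, 0.03], qhat))
```

Under exchangeability between calibration and test data, the split-conformal prediction set contains the true label with probability at least 1 - alpha. That assumption is exactly what epitope shift strains, which is presumably why the paper evaluates under epitope-held-out and distance-aware splits.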
