Counterfactual Peptide Editing for Causal TCR-pMHC Binding Inference

Sanjar Khudoyberdiev; Arman Bekov

arXiv:2604.13256·cs.LG·April 16, 2026

TL;DR

This paper introduces CIP, a training framework that improves TCR-pMHC binding prediction by generating biologically constrained counterfactual peptide edits to reduce shortcut learning and enhance out-of-distribution robustness.

Contribution

CIP is a training method that enforces invariance to conservative edits at non-anchor peptide positions and sensitivity to disruptions at MHC anchor residues, with the aim of modeling TCR specificity causally rather than through dataset shortcuts.
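
The two kinds of counterfactual edit described above can be sketched as follows. This is a minimal illustration, not the authors' generator. The residue groups in `GROUPS`, the helper names, and the choice of anchors (position 2 and the C-terminal residue, a common pattern for HLA class I peptides that varies by allele) are all assumptions made here for the example.

```python
import random

# Hypothetical physicochemical groups used to define a "conservative" swap.
# The paper's actual substitution rules are not stated on this page.
GROUPS = [
    "AVLIM",  # aliphatic / hydrophobic
    "FWY",    # aromatic
    "ST",     # small polar
    "NQ",     # amide
    "DE",     # acidic
    "KRH",    # basic
]
GROUP_OF = {aa: group for group in GROUPS for aa in group}


def anchor_positions(peptide):
    """Assumed anchors: 0-based index 1 (P2) and the C-terminal residue."""
    return {1, len(peptide) - 1}


def conservative_edit(peptide, rng):
    """Swap one non-anchor residue for another residue of the same group."""
    anchors = anchor_positions(peptide)
    candidates = [
        i for i, aa in enumerate(peptide)
        if i not in anchors and aa in GROUP_OF
    ]
    if not candidates:
        return peptide  # nothing editable; return the input unchanged
    i = rng.choice(candidates)
    options = [aa for aa in GROUP_OF[peptide[i]] if aa != peptide[i]]
    return peptide[:i] + rng.choice(options) + peptide[i + 1:]


def anchor_disruption(peptide, rng):
    """Replace one anchor residue with a residue from a different group."""
    i = rng.choice(sorted(anchor_positions(peptide)))
    old_group = GROUP_OF.get(peptide[i], "")
    options = [aa for group in GROUPS if group != old_group for aa in group]
    return peptide[:i] + rng.choice(options) + peptide[i + 1:]


if __name__ == "__main__":
    rng = random.Random(0)
    peptide = "GILGFVFTL"  # example 9-mer, used only for illustration
    print("original:         ", peptide)
    print("non-anchor edit:  ", conservative_edit(peptide, rng))
    print("anchor disruption:", anchor_disruption(peptide, rng))
```

Under this scheme a model would be expected to give nearly the same score to the original and the non-anchor edit, and a clearly different score to the anchor-disrupted peptide.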

Findings

1. CIP achieves AUROC 0.831 on a challenging benchmark.

2. CIP reduces the shortcut index by 39.7%.

3. Anchor-aware edits drive out-of-distribution gains.

Abstract

Neural models for TCR-pMHC binding prediction are susceptible to shortcut learning: they exploit spurious correlations in training data, such as peptide length bias or V-gene co-occurrence, rather than the physical binding interface. This renders predictions brittle under family-held-out and distance-aware evaluation, where such shortcuts do not transfer. We introduce Counterfactual Invariant Prediction (CIP), a training framework that generates biologically constrained counterfactual peptide edits and enforces invariance to edits at non-anchor positions while amplifying sensitivity at MHC anchor residues. CIP augments the base classifier with two auxiliary objectives: (1) an invariance loss penalizing prediction changes under conservative non-anchor substitutions, and (2) a contrastive loss encouraging large prediction changes under anchor-position disruptions. Evaluated on…
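
The two auxiliary objectives named in the abstract could take a form like the sketch below. The abstract does not give the loss formulas, so the specific choices here are assumptions: a squared-difference invariance term, a hinge-style contrastive term with a hypothetical `margin`, binary cross-entropy as the base loss, and the weights `lam_inv` and `lam_con`. Inputs are predicted binding probabilities for the original peptide, its non-anchor edit, and its anchor-disrupted edit.

```python
import math


def bce(p, y, eps=1e-7):
    """Binary cross-entropy for one probability p and label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def invariance_loss(p_orig, p_nonanchor):
    """Penalize prediction changes under conservative non-anchor edits.

    Assumed form: mean squared difference between paired predictions.
    """
    return sum((a - b) ** 2 for a, b in zip(p_orig, p_nonanchor)) / len(p_orig)


def contrastive_loss(p_orig, p_anchor, margin=0.5):
    """Encourage large prediction changes under anchor disruptions.

    Assumed form: a hinge that is zero once the change reaches `margin`.
    """
    return sum(
        max(0.0, margin - abs(a - b)) for a, b in zip(p_orig, p_anchor)
    ) / len(p_orig)


def cip_objective(p_orig, p_nonanchor, p_anchor, labels,
                  lam_inv=1.0, lam_con=1.0, margin=0.5):
    """Base classification loss plus the two weighted auxiliary terms."""
    base = sum(bce(p, y) for p, y in zip(p_orig, labels)) / len(labels)
    return (
        base
        + lam_inv * invariance_loss(p_orig, p_nonanchor)
        + lam_con * contrastive_loss(p_orig, p_anchor, margin)
    )


if __name__ == "__main__":
    # A prediction that ignores the non-anchor edit and reacts to the anchor
    # edit should score lower than one that does the opposite.
    desired = cip_objective([0.9], [0.9], [0.1], [1])
    shortcut = cip_objective([0.9], [0.5], [0.9], [1])
    print("desired behavior: ", desired)
    print("shortcut behavior:", shortcut)
```

In a real training loop these terms would be computed on differentiable model outputs in a framework of choice. One design question this sketch leaves open is whether the anchor term should apply only to known binders, since disrupting an anchor in a non-binding pair carries no clear expected change.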
