Interpreting BERT architecture predictions for peptide presentation by MHC class I proteins
Hans-Christof Gasser, Georges Bedran, Bo Ren, David Goodlett, Javier Alfaro, Ajitha Rajan

TL;DR
This paper introduces ImmunoBERT, a BERT-based model for predicting peptide presentation by MHC class I proteins, and applies interpretability techniques to understand the model's decision factors, aligning with biological insights.
Contribution
The study presents a novel BERT-based model for MHC I peptide presentation prediction and demonstrates the use of SHAP and LIME interpretability methods in this domain.
Findings
Amino acids near the peptide termini are highly influential for the model's predictions.
MHC positions in the A, B, and F binding pockets are among the most important input features.
Model predictions align with known biological structures.
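To make the idea of per-position importance concrete, the following is a minimal sketch of exact Shapley attribution over peptide positions. Everything in it is an assumption for illustration: `toy_presentation_score` is a hypothetical stand-in for a trained predictor (it is not ImmunoBERT), the example peptide is arbitrary, and hiding a position by substituting the placeholder residue `X` is only one possible masking scheme. The SHAP library typically approximates these values rather than enumerating every subset as done here.

```python
from itertools import combinations
from math import factorial

MASK = "X"  # placeholder residue meaning "this position is hidden" (assumption)


def toy_presentation_score(peptide: str) -> float:
    """Hypothetical scoring function, NOT the ImmunoBERT model.

    It rewards L at position 2 and V at the last position, plus a small
    bonus when both are present, so the attribution has a known answer.
    """
    second_is_l = peptide[1] == "L"
    last_is_v = peptide[-1] == "V"
    score = 1.0 * second_is_l + 1.0 * last_is_v
    if second_is_l and last_is_v:
        score += 0.5
    return score


def mask_except(peptide: str, keep: set) -> str:
    """Return the peptide with every position outside `keep` replaced by MASK."""
    return "".join(aa if i in keep else MASK for i, aa in enumerate(peptide))


def shapley_values(peptide: str, score_fn) -> list:
    """Exact Shapley value of each position, enumerating all subsets."""
    n = len(peptide)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of `size` other positions.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                kept = set(subset)
                without_i = score_fn(mask_except(peptide, kept))
                with_i = score_fn(mask_except(peptide, kept | {i}))
                values[i] += weight * (with_i - without_i)
    return values


peptide = "KLDEFGHIV"  # arbitrary 9-mer chosen for the toy example
phi = shapley_values(peptide, toy_presentation_score)
for pos, (aa, value) in enumerate(zip(peptide, phi), start=1):
    print(f"P{pos} {aa}: {value:+.3f}")
```

In this toy setup, positions 2 and 9 each receive an attribution of 1.25 (their own reward plus half of the shared bonus) and every other position receives 0. The attributions sum to the difference between the score of the full peptide and the score of the fully masked one, which is the efficiency property that makes Shapley values additive.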
Abstract
The major histocompatibility complex (MHC) class I pathway supports the detection of cancer and viruses by the immune system. It presents parts of proteins (peptides) from inside a cell on its membrane surface, enabling visiting immune cells that detect non-self peptides to terminate the cell. The ability to predict whether a peptide will be presented on MHC class I molecules helps in designing vaccines so they can activate the immune system to destroy the invading disease protein. We designed a prediction model using a BERT-based architecture (ImmunoBERT) that takes as input a peptide and its surrounding regions (N- and C-terminal) along with a set of MHC class I (MHC-I) molecules. We present a novel application of well-known interpretability techniques, SHAP and LIME, to this domain and we use these results along with 3D structure visualizations and amino acid frequencies to…
Taxonomy
Topics: Vaccines and immunoinformatics approaches · Influenza Virus Research Studies · RNA and protein synthesis mechanisms
Methods: Shapley Additive Explanations · Local Interpretable Model-Agnostic Explanations
