# SiaScoreNet: a siamese neural network-based model integrating prediction scores for HLA-peptide interaction prediction

**Authors:** Mahsa Saadat, Fatemeh Zare-Mirakabad, Milad Besharatifard

PMC · DOI: 10.1093/bioadv/vbaf248 · Bioinformatics Advances · 2025-11-19

## TL;DR

SiaScoreNet is a new model that improves predictions of HLA-peptide interactions, which is important for cancer immunotherapy.

## Contribution

SiaScoreNet introduces a three-step pipeline for HLA-peptide interaction prediction that combines a siamese neural network with a nonlinear ensemble strategy.

## Key findings

- SiaScoreNet outperforms existing models in accuracy and runtime efficiency.
- The model integrates scores from state-of-the-art predictors using a nonlinear strategy.
- Benchmark results show SiaScoreNet performs well compared to TransPHLA, BigMHC, and CapHLA.

## Abstract

Cancer immunotherapy uses the immune system to recognize and eliminate tumor cells by presenting tumor antigens through Human Leukocyte Antigen (HLA) molecules. Accurate prediction of HLA–peptide interactions is essential for personalized immunotherapy development. Allele-specific models achieve high accuracy and handle variable peptide lengths but require separate training for each allele, limiting scalability to rare or unseen HLAs. Pan-specific models generalize across multiple alleles and match or surpass allele-specific methods. Ensemble methods improve prediction by combining outputs from multiple predictors, often via linear combinations, though nonlinear strategies may better capture HLA–peptide complexities.

We propose SiaScoreNet, a three-step predictive pipeline enhancing HLA–peptide interaction prediction. First, ESM, a pretrained transformer-based protein language model, embeds HLA and peptide sequences into fixed-length representations, accommodating varying sequence lengths. Second, we integrate predicted scores from state-of-the-art models into a comprehensive feature vector. Third, a nonlinear ensemble strategy combines features, capturing complex dependencies and boosting performance.
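The three steps above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: `toy_residue_vector` is a hypothetical stand-in for ESM's per-residue embeddings, mean pooling is one assumed way to obtain a fixed-length vector, and the sequences, predictor scores, layer sizes, and random weights are made up. The actual code is in the repository linked in the abstract.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
EMBED_DIM = 8  # toy size chosen for readability; an assumption, not ESM's real width


def toy_residue_vector(residue):
    """Hypothetical stand-in for a per-residue language-model embedding."""
    rng = random.Random(AMINO_ACIDS.index(residue))  # deterministic per residue
    return [rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)]


def embed(sequence):
    """Step 1: mean-pool residue vectors so any length maps to EMBED_DIM values."""
    vectors = [toy_residue_vector(r) for r in sequence]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def build_features(hla_seq, peptide_seq, predictor_scores):
    """Step 2: one shared embed() for both inputs (the siamese idea),
    concatenated with scores from external predictors."""
    return embed(hla_seq) + embed(peptide_seq) + list(predictor_scores)


def nonlinear_ensemble(features, w_hidden, b_hidden, w_out, b_out):
    """Step 3: one ReLU hidden layer plus a sigmoid, a minimal nonlinear combiner."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Untrained random weights, for illustration only.
rng = random.Random(0)
n_scores, n_hidden = 3, 4
n_features = 2 * EMBED_DIM + n_scores
w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_features)] for _ in range(n_hidden)]
b_hidden = [0.0] * n_hidden
w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]

# Placeholder sequences and made-up predictor scores.
features = build_features("GSHSMRYFFTSVSRPGR", "SLYNTVATL", [0.91, 0.84, 0.77])
probability = nonlinear_ensemble(features, w_hidden, b_hidden, w_out, 0.0)
print(len(features), round(probability, 3))
```

Because the weights are untrained, the printed probability carries no biological meaning. The sketch only shows how inputs of different lengths reach a fixed-size feature vector and how a nonlinear layer, rather than a weighted sum, combines embeddings with predictor scores.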

Benchmark evaluations show that SiaScoreNet outperforms existing models in accuracy and is comparable to TransPHLA, BigMHC, and CapHLA. Recent models prioritize recall over precision, which helps identify potential binders but is resource-intensive. SiaScoreNet offers improved performance and runtime efficiency relative to these models and was also evaluated on human papillomavirus (HPV) data for HLA–peptide prediction.
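The recall-versus-precision trade-off mentioned above can be made concrete with a small worked example. The labels, scores, and thresholds below are invented for illustration and are not results from the paper. The example shows that a lenient decision threshold recovers every true binder but also flags more non-binders, each of which would presumably need follow-up validation.

```python
def precision_recall(y_true, y_pred):
    """Compute (precision, recall) for binary labels, where 1 means binder."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Made-up ground truth and predicted binding scores for eight peptides.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.35, 0.3, 0.2, 0.1]


def at_threshold(threshold):
    """Call a peptide a binder when its score reaches the threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    return precision_recall(y_true, y_pred)


lenient = at_threshold(0.3)   # recall-oriented: precision 0.5, recall 1.0
strict = at_threshold(0.65)   # precision-oriented: precision 1.0, recall 2/3
print(lenient, strict)
```

With the lenient threshold, half of the six predicted binders are false positives. With the strict threshold, every predicted binder is real, but one true binder is missed.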

The data and source code for the predictions and experiments presented in this study are publicly available in the SiaScoreNet repository on GitHub: https://github.com/CBRC-lab/SiaScoreNet.

## Linked entities

- **Models:** ESM (a pretrained transformer-based protein language model, as described in the abstract)
- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** HLA-A (major histocompatibility complex, class I, A) [NCBI Gene 3105] {aka HLAA}, HLA-S (major histocompatibility complex, class I, S (pseudogene)) [NCBI Gene 267015] {aka HLA-17}
- **Diseases:** Cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12641608/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12641608/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC12641608/full.md

---
Source: https://tomesphere.com/paper/PMC12641608