# MarkerPredict: predicting clinically relevant predictive biomarkers with machine learning

**Authors:** Daniel V. Veres, Peter Csermely, Klára Schulc

PMC · DOI: 10.1038/s41540-025-00603-0 · NPJ Systems Biology and Applications · 2025-11-21

## TL;DR

This paper introduces MarkerPredict, a machine learning tool that identifies potential predictive biomarkers for targeted cancer therapies by analysing network motifs and intrinsic protein disorder in signalling networks.

## Contribution

The paper integrates network motifs and protein disorder with Random Forest and XGBoost models to predict clinically relevant predictive biomarkers.

## Key findings

- MarkerPredict classified 3670 target-neighbour pairs using 32 models, with LOOCV accuracy of 0.7–0.96.
- The Biomarker Probability Score (BPS) identified 2084 potential predictive biomarkers, 426 of which were classified as biomarkers by all four calculations.
- The study highlights the biomarker potential of LCK and ERK1 for targeted cancer therapies.
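
Leave-one-out cross-validation (LOOCV), the accuracy measure quoted above, holds out each sample once, trains on the rest, and reports the fraction of held-out samples predicted correctly. The sketch below illustrates only that evaluation loop. It is not the authors' code: the paper uses Random Forest and XGBoost, whereas this sketch swaps in a 1-nearest-neighbour classifier so that it runs with the standard library alone. The function names and the toy feature vectors are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# One sample: (feature vector, binary label). Features here are made up.
Sample = Tuple[Sequence[float], int]


def nearest_neighbour_predict(train: List[Sample], x: Sequence[float]) -> int:
    """Stand-in classifier (NOT the paper's model): label of the closest training point."""
    def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    return min(train, key=lambda s: squared_distance(s[0], x))[1]


def loocv_accuracy(
    data: List[Sample],
    predict: Callable[[List[Sample], Sequence[float]], int],
) -> float:
    """Hold out each sample once, fit on the remainder, and score the held-out prediction."""
    correct = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # every sample except the i-th
        if predict(train, x) == y:
            correct += 1
    return correct / len(data)


# Hypothetical toy data: two well-separated clusters of labelled pairs.
toy_pairs: List[Sample] = [
    ([0.0, 0.1], 0), ([0.1, 0.0], 0), ([0.2, 0.2], 0),
    ([1.0, 0.9], 1), ([0.9, 1.0], 1), ([0.8, 0.8], 1),
]
accuracy = loocv_accuracy(toy_pairs, nearest_neighbour_predict)
print(f"LOOCV accuracy: {accuracy:.2f}")
```

Any classifier with the same `predict(train, x)` shape could be passed to `loocv_accuracy` in place of the stand-in.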

## Abstract

Precision oncology relies on predictive biomarkers for selecting targeted cancer therapies. Network-based properties of proteins, together with structural features such as intrinsic disorder, are likely to shape their potential as biomarkers. We therefore designed a hypothesis-generating framework that integrates network motifs and protein disorder to explore their contribution to predictive biomarker discovery. This encouraged us to develop MarkerPredict, using literature evidence-based positive and negative training sets totalling 880 target-interacting protein pairs with Random Forest and XGBoost machine learning models on three signalling networks. MarkerPredict classified 3670 target-neighbour pairs with 32 different models, achieving a 0.7–0.96 LOOCV accuracy. We defined a Biomarker Probability Score (BPS) as a normalised summative rank of the models. The scores identified 2084 potential predictive biomarkers for targeted cancer therapeutics, of which 426 were classified as biomarkers by all 4 calculations. We detailed the biomarker potential of LCK and ERK1. This study encourages further validation of the high-ranked predictive biomarkers. The development of the MarkerPredict tool (which is available on GitHub) for predictive biomarker identification may have a significant impact on clinical decision-making in oncology.
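
The abstract describes the Biomarker Probability Score only as a normalised summative rank of the models, without giving a formula. One plausible reading is sketched below: rank the pairs within each model by predicted probability, sum the ranks across models, and min-max scale the sums to [0, 1]. The tie handling (average ranks), the min-max normalisation, the function names, and the toy probabilities are all assumptions made for illustration and may differ from the published definition.

```python
from typing import Dict, List


def average_ranks(scores: List[float]) -> List[float]:
    """1-based ranks, higher score gets higher rank; tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def biomarker_probability_score(model_probs: Dict[str, List[float]]) -> List[float]:
    """Assumed BPS: sum per-model ranks for each pair, then min-max scale to [0, 1]."""
    per_model = [average_ranks(p) for p in model_probs.values()]
    n_pairs = len(per_model[0])
    summed = [sum(r[i] for r in per_model) for i in range(n_pairs)]
    lo, hi = min(summed), max(summed)
    if hi == lo:  # all pairs ranked identically; avoid division by zero
        return [0.0] * n_pairs
    return [(s - lo) / (hi - lo) for s in summed]


# Hypothetical probabilities for three target-neighbour pairs from two models.
toy = {
    "rf_model": [0.9, 0.2, 0.6],
    "xgb_model": [0.8, 0.1, 0.7],
}
bps = biomarker_probability_score(toy)
print(bps)  # → [1.0, 0.0, 0.5]
```

Summing ranks instead of raw probabilities makes the score insensitive to differences in how each model calibrates its outputs, which is one reason a rank-based aggregate can be preferred when combining many models.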

## Linked entities

- **Genes:** LCK (LCK proto-oncogene, Src family tyrosine kinase) [NCBI Gene 3932], MAPK3 (mitogen-activated protein kinase 3) [NCBI Gene 5595]
- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** MAPK3 (mitogen-activated protein kinase 3) [NCBI Gene 5595] {aka ERK-1, ERK1, ERT2, HS44KDAP, HUMKER1A, P44ERK1}, LCK (LCK proto-oncogene, Src family tyrosine kinase) [NCBI Gene 3932] {aka IMD22, LSK, YT16, p56lck, pp58lck}
- **Diseases:** cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12638940/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12638940/full.md

## References

6 references — full list in the complete paper: https://tomesphere.com/paper/PMC12638940/full.md

---
Source: https://tomesphere.com/paper/PMC12638940