# Predicting breast cancer prognosis based on a novel pathomics model through CHEK1 expression analysis using machine learning algorithms

**Authors:** Chen Chen, Dan Gao, Huan Yue, Huijing Wang, Rui Qu, Xiaochi Hu, Libo Luo

PMC · DOI: 10.1371/journal.pone.0321717 · PLOS One · 2025-05-09

## TL;DR

This study uses machine learning to predict breast cancer prognosis based on CHEK1 gene expression, showing that a novel pathomics model can help guide treatment decisions.

## Contribution

A novel pathomics model is developed to predict CHEK1 expression and prognosis in breast cancer using machine learning.

## Key findings

- A pathomics model reliably predicted CHEK1 overexpression and correlated with survival outcomes in breast cancer patients.
- High pathomics scores were associated with poorer prognosis and better response to anti-PD-1 and anti-CTLA4 treatments.
- Validation using tissue microarrays confirmed the model's ability to predict CHEK1 expression and prognosis.

## Abstract

Checkpoint kinase 1 (CHEK1) is often overexpressed in solid tumors. Nonetheless, the prognostic significance of CHEK1 in breast cancer (BrC) remains unclear. This study used pathomics, which leverages machine learning, to predict BrC prognosis based on CHEK1 gene expression.

Initially, hematoxylin-eosin (H&E)-stained images obtained from The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) were segmented using Otsu’s method. Further, the sub-image features were extracted using machine learning algorithms based on PyRadiomics, mRMRe, and Gradient Boosting Machine (GBM). The predicted CHEK1 expression levels were represented as the pathomics score (PS) and validated using the corresponding RNA-seq data. The prognostic significance of both CHEK1 and PS was evaluated using Kaplan-Meier (KM) analysis and univariate and multivariate Cox regression. The model was assessed by comparing CHEK1 expression by immunohistochemistry (IHC) with PS in a BrC tissue microarray (TMA).
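The segmentation step can be illustrated with a minimal sketch of Otsu's thresholding on a grayscale tile. This is an illustration under stated assumptions, not the authors' pipeline: the function name `otsu_threshold`, the synthetic tile, and the rule that tissue is darker than the slide background are all hypothetical choices made here.

```python
import numpy as np


def otsu_threshold(gray, bins=256):
    """Return the intensity cut that maximizes between-class variance."""
    hist, edges = np.histogram(np.ravel(gray), bins=bins, range=(0, 256))
    prob = hist / hist.sum()
    centers = (edges[:-1] + edges[1:]) / 2

    w0 = np.cumsum(prob)                  # weight of the darker class per cut
    w1 = 1.0 - w0                         # weight of the brighter class
    cum_mean = np.cumsum(prob * centers)  # cumulative intensity mean
    total_mean = cum_mean[-1]

    # Between-class variance; cuts that leave one class empty score zero.
    denom = w0 * w1
    sigma_b = np.zeros_like(denom)
    valid = denom > 1e-12
    sigma_b[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / denom[valid]

    k = int(np.argmax(sigma_b))
    return float(edges[k + 1])


# Synthetic tile (assumption): half stained tissue (~100), half background (~230).
rng = np.random.default_rng(0)
tissue = rng.normal(100, 10, size=5000)
background = rng.normal(230, 5, size=5000)
tile = np.clip(np.concatenate([tissue, background]), 0, 255).reshape(100, 100)

threshold = otsu_threshold(tile)
tissue_mask = tile < threshold  # assumption: tissue is darker than background
```

In practice an existing implementation such as `skimage.filters.threshold_otsu` would usually be preferred; the hand-written version only makes the between-class variance criterion explicit.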

A 633 × 10 sub-image set was eligible for training and a 158 × 10 set for validation. In total, 1,488 features were extracted, and 8 features screened by recursive feature elimination (RFE) were used to generate the model. A high PS was associated with CHEK1 overexpression and correlated significantly with survival outcomes, especially within 96 months post-diagnosis. Further, patients with high PS responded to anti-programmed cell death protein 1 (anti-PD-1) and anti-cytotoxic T lymphocyte antigen-4 (anti-CTLA4) treatments. In TMA validation, IHC analysis showed that a high PS similarly predicted poorer prognosis and correlated with higher CHEK1 expression.
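To make the survival comparison concrete, the sketch below computes Kaplan-Meier estimates for high-PS and low-PS groups. Everything in it is assumed for illustration: the eight patients are toy values rather than study data, the median split is a hypothetical cutoff (the abstract does not state how PS was dichotomized), and `kaplan_meier` is a hand-written helper.

```python
import numpy as np


def kaplan_meier(time, event):
    """Return distinct event times and the survival probability after each."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    event_times = np.unique(time[event])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(time >= t)            # still under observation at t
        deaths = np.sum((time == t) & event)   # events occurring exactly at t
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)


# Toy cohort (NOT study data): pathomics score, follow-up in months, event flag.
ps = np.array([0.2, 0.8, 0.5, 0.9, 0.1, 0.7, 0.3, 0.6])
months = np.array([120, 30, 96, 24, 110, 60, 100, 48])
event = np.array([0, 1, 0, 1, 0, 1, 1, 0])  # 1 = death, 0 = censored

high = ps > np.median(ps)  # assumption: median split into high/low PS

times_high, surv_high = kaplan_meier(months[high], event[high])
times_low, surv_low = kaplan_meier(months[~high], event[~high])
```

On this toy cohort the high-PS curve falls faster than the low-PS curve, which is the pattern a log-rank test or Cox model would then be used to test formally.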

The novel pathomics model reliably predicted CHEK1 expression using machine learning algorithms, which might provide potential clinical utility for prognosis and treatment guidance.

## Linked entities

- **Genes:** CHEK1 (checkpoint kinase 1) [NCBI Gene 1111]
- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Genes:** CTLA4 (cytotoxic T-lymphocyte associated protein 4) [NCBI Gene 1493] {aka ALPS5, CD, CD152, CELIAC3, CTLA-4, GRD4}, CHEK1 (checkpoint kinase 1) [NCBI Gene 1111] {aka CHK1, OZEMA21}, PDCD1 (programmed cell death 1) [NCBI Gene 5133] {aka ADMIO4, AIMTBS, CD279, PD-1, PD1, SLEB2}
- **Diseases:** BRCA (MESH:D001941), BrC (MESH:D001943), Cancer (MESH:D009369)
- **Chemicals:** H&E (-), eosin (MESH:D004801), hematoxylin (MESH:D006416)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12064205/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12064205/full.md

## References

74 references — full list in the complete paper: https://tomesphere.com/paper/PMC12064205/full.md

---
Source: https://tomesphere.com/paper/PMC12064205