# Identification of Crohn's Disease-Related Biomarkers and Pan-Cancer Analysis Based on Machine Learning

**Authors:** Tangyu Yuan, Jiayin Xing, Pengtao Liu

PMC · DOI: 10.1155/mi/6631637 · 2025-04-04

## TL;DR

This study identifies S100P and S100A8 as potential diagnostic biomarkers for Crohn's disease and finds that S100P is associated with immune cell infiltration and patient survival in liver and lung cancers.

## Contribution

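The study applies an integrative machine learning pipeline (LASSO regression, neural network modeling, and SVM-RFE) to identify CD biomarkers, then evaluates them across 33 cancer types, with S100P showing the clearest pan-cancer relevance.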

## Key findings

- S100P and S100A8 emerged as key candidate biomarkers for Crohn's disease diagnosis and were validated in the independent dataset GSE169568.
- S100P is significantly associated with immune infiltration and survival in liver and lung cancers.
- The findings suggest a link between chronic inflammation in CD and elevated cancer risk.

## Abstract

Background: In recent years, the incidence of Crohn's disease (CD) has shown a significant global increase, with numerous studies demonstrating its correlation with various cancers. This study aims to identify novel biomarkers for diagnosing CD and explore their potential applications in pan-cancer analysis.

Methods: Gene expression profiles were retrieved from the Gene Expression Omnibus (GEO) database, and differentially expressed genes (DEGs) were identified using the “limma” R package. Key biomarkers were selected through an integrative machine learning pipeline combining LASSO regression, neural network modeling, and Support Vector Machine-Recursive Feature Elimination (SVM-RFE). Six hub genes were identified and further validated using the independent dataset GSE169568. To assess the broader relevance of these biomarkers, a standardized pan-cancer dataset from the UCSC database was analyzed to evaluate their associations with 33 cancer types.
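The feature-selection idea in the Methods can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes scikit-learn in place of the R tooling, uses synthetic data with hypothetical gene names rather than GEO profiles, fits LASSO as a plain linear regression on 0/1 labels, leaves out the neural network step, and assumes that the selectors are combined by intersecting their gene sets, which the abstract does not specify.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a samples x genes expression matrix (NOT study data).
X, y = make_classification(
    n_samples=120, n_features=50, n_informative=5, n_redundant=0,
    shuffle=False, random_state=0,
)
genes = np.array([f"gene_{i}" for i in range(X.shape[1])])  # hypothetical names
X = StandardScaler().fit_transform(X)

# LASSO: keep genes whose coefficient is not shrunk to exactly zero.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_genes = set(genes[lasso.coef_ != 0])

# SVM-RFE: repeatedly drop the lowest-weight gene of a linear SVM
# until 10 genes remain (the cutoff of 10 is an arbitrary assumption).
rfe = RFE(SVC(kernel="linear"), n_features_to_select=10, step=1).fit(X, y)
svm_genes = set(genes[rfe.support_])

# Assumed consensus rule: genes chosen by both selectors.
consensus = sorted(lasso_genes & svm_genes)
print(consensus)
```

In a real analysis the input would be the DEG-filtered expression matrix, and the candidate genes would then be checked against an independent cohort, as the study does with GSE169568.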

Results: Among the identified biomarkers, S100 calcium binding protein P (S100P) and S100 calcium binding protein A8 (S100A8) emerged as key candidates for CD diagnosis, with strong validation in the independent dataset. Notably, S100P displayed significant associations with immune cell infiltration and patient survival outcomes in both liver and lung cancers. These findings suggest that chronic inflammation and immune imbalances in CD may not only contribute to disease progression but also elevate cancer risk. As an inflammation-associated biomarker, S100P holds particular promise for both CD diagnosis and potential cancer risk stratification, especially in liver and lung cancers.

Conclusion: Our study highlights S100P and S100A8 as potential diagnostic biomarkers for CD. Moreover, the pan-cancer analysis underscores the broader clinical relevance of S100P, offering new insights into its role in immune modulation and cancer prognosis. These findings provide a valuable foundation for future research into the shared molecular pathways linking chronic inflammatory diseases and cancer development.

## Linked entities

- **Genes:** S100P (S100 calcium binding protein P) [NCBI Gene 6286], S100A8 (S100 calcium binding protein A8) [NCBI Gene 6279]
- **Diseases:** Crohn's disease (MONDO:0005011), liver cancer (MONDO:0002691), lung cancer (MONDO:0005138)

## Full-text entities

- **Genes:** S100A8 (S100 calcium binding protein A8) [NCBI Gene 6279] {aka 60B8AG, CAGA, CFAG, CGLA, CP-10, L1Ag}, S100P (S100 calcium binding protein P) [NCBI Gene 6286] {aka MIG9}
- **Diseases:** Cancer (MESH:D009369), chronic inflammation (MESH:D007249), liver and lung cancers (MESH:D008175), CD (MESH:D003424)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11991868/full.md

---
Source: https://tomesphere.com/paper/PMC11991868