# gdGSE: An algorithm to evaluate pathway enrichment by discretizing gene expression values

**Authors:** Jiangti Luo, Qiqi Lu, Mengjiao He, Xiaobo Zhang, Xiang Yang, Xiaosheng Wang

PMC · DOI: 10.1016/j.csbj.2025.04.038 · Computational and Structural Biotechnology Journal · 2025-05-01

## TL;DR

gdGSE is a gene set enrichment method that binarizes gene expression values before scoring pathway activity. The authors report better cancer stemness quantification, tumor subtype clustering, and cell type identification in bulk and single-cell data.

## Contribution

gdGSE introduces a novel approach using discretized gene expression to enhance pathway enrichment analysis in both bulk and single-cell data.

## Key findings

- gdGSE improves cancer stemness quantification with prognostic relevance.
- It enhances tumor subtype clustering and cell type identification.
- Pathway activity scores show >90% concordance with experimentally validated drug mechanisms in patient-derived xenografts and estrogen receptor-positive breast cancer cell lines.

## Abstract

We proposed gdGSE, a novel computational framework for gene set enrichment analysis. Unlike conventional methods that rely on continuous gene expression values, gdGSE employs discretized gene expression profiles to assess pathway activity. This approach effectively mitigates discrepancies caused by data distributions. The algorithm consists of two steps: (1) applying statistical thresholds to binarize the gene expression matrix, and (2) converting the binarized gene expression matrix into a gene set enrichment matrix. Our results demonstrated that gdGSE could robustly extract biological insights from a diverse array of simulated and real bulk or single-cell gene expression datasets. Notably, gene set enrichment scores computed by gdGSE exhibited enhanced utility in downstream applications: (1) precise quantification of cancer stemness with significant prognostic relevance; (2) enhanced clustering performance in stratifying tumor subtypes with distinct prognoses; and (3) more accurate identification of cell types. Remarkably, the pathway activity scores computed by gdGSE showed >90% concordance with experimentally validated drug mechanisms in patient-derived xenografts and estrogen receptor-positive breast cancer cell lines. Our findings suggest that discretizing gene expression values provides an alternative method for evaluating pathway enrichment, applicable to both bulk and single-cell data analysis.
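
The two steps described in the abstract can be illustrated with a small sketch. This summary does not state which statistical thresholds gdGSE uses or how the binarized matrix is converted into enrichment scores, so both choices below are assumptions made only for illustration: a per-gene median threshold, and a score equal to the fraction of a gene set's genes that are "on" in each sample. The function names `binarize_expression` and `gene_set_scores` are hypothetical and are not part of any gdGSE package.

```python
import numpy as np


def binarize_expression(expr):
    """Step 1 (illustrative): binarize a genes x samples matrix.

    ASSUMPTION: each gene is thresholded at its own median across samples.
    The thresholds actually used by gdGSE are not given in this summary.
    """
    thresholds = np.median(expr, axis=1, keepdims=True)
    return (expr > thresholds).astype(np.int8)


def gene_set_scores(binary, gene_names, gene_sets):
    """Step 2 (illustrative): turn a binary matrix into gene set scores.

    ASSUMPTION: the score of a gene set in a sample is the fraction of its
    member genes that are 1 in that sample. Genes absent from the matrix
    are ignored; a set with no measured genes gets NaN scores.
    """
    index = {gene: i for i, gene in enumerate(gene_names)}
    scores = {}
    for set_name, genes in gene_sets.items():
        rows = [index[g] for g in genes if g in index]
        if rows:
            scores[set_name] = binary[rows].mean(axis=0)
        else:
            scores[set_name] = np.full(binary.shape[1], np.nan)
    return scores


# Toy data: 4 genes x 3 samples (made-up values, not from the paper).
gene_names = ["G1", "G2", "G3", "G4"]
expr = np.array([
    [1.0, 5.0, 9.0],
    [2.0, 2.0, 8.0],
    [7.0, 3.0, 1.0],
    [4.0, 6.0, 5.0],
])
gene_sets = {"setA": ["G1", "G2"], "setB": ["G3", "G4", "G9"]}

binary = binarize_expression(expr)
scores = gene_set_scores(binary, gene_names, gene_sets)
```

The result is a gene set by sample score table that depends only on whether each gene is above its threshold, not on the magnitude of its expression. That is the property the abstract credits with reducing sensitivity to differences in data distributions.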

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)

## Full-text entities

- **Diseases:** cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12127574/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12127574/full.md

## References

30 references — full list in the complete paper: https://tomesphere.com/paper/PMC12127574/full.md

---
Source: https://tomesphere.com/paper/PMC12127574