# A kernel density estimation-based approach for quantifying O-GlcNAcylation dysregulation in cancer from gene expression data

**Authors:** Rastko Stojšin, Jinlian Wang, Hongfang Liu

PMC · DOI: 10.1093/bioadv/vbag045 · Bioinformatics Advances · 2026-02-13

## TL;DR

This paper introduces a kernel density estimation method that infers O-GlcNAcylation dysregulation in cancer from joint OGT and OGA gene expression, making large-scale studies feasible without direct measurement.

## Contribution

A novel nonparametric kernel density estimation approach for quantifying O-GlcNAcylation dysregulation from transcriptomic data.
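To make the idea concrete, the following is a minimal sketch of how a density-based score over joint OGT and OGA expression could be built. It is not the authors' implementation: the product Gaussian kernel, the Scott's-rule bandwidth, the density-rank scoring rule, the synthetic data, and the names `fit_kde` and `regulation_score` are all assumptions made for illustration.

```python
import math
import random
import statistics


def fit_kde(points):
    """Return a density function for 2D (OGT, OGA) points.

    Uses a product Gaussian kernel with per-dimension Scott's-rule
    bandwidths. Both choices are assumptions, not the paper's settings.
    """
    n = len(points)
    factor = n ** (-1.0 / 6.0)  # Scott's rule for d = 2: n^(-1/(d+4))
    hx = statistics.stdev([p[0] for p in points]) * factor
    hy = statistics.stdev([p[1] for p in points]) * factor
    norm = 1.0 / (n * 2.0 * math.pi * hx * hy)

    def density(x, y):
        total = 0.0
        for px, py in points:
            zx = (x - px) / hx
            zy = (y - py) / hy
            total += math.exp(-0.5 * (zx * zx + zy * zy))
        return norm * total

    return density


def regulation_score(density, ref_densities, ogt, oga):
    """Hypothetical score in [0, 1]: the fraction of reference samples
    whose own density is at or below the query sample's density."""
    d = density(ogt, oga)
    return sum(1 for r in ref_densities if r <= d) / len(ref_densities)


# Synthetic reference cohort with correlated OGT/OGA expression
# (made-up numbers standing in for healthy-tissue log expression).
rng = random.Random(0)
reference = []
for _ in range(300):
    ogt = rng.gauss(5.0, 1.0)
    oga = 0.8 * ogt + rng.gauss(1.0, 0.6)
    reference.append((ogt, oga))

density = fit_kde(reference)
ref_densities = [density(x, y) for x, y in reference]

# A sample near the centre of the reference cloud versus one with
# high OGT and low OGA, which breaks the usual co-expression pattern.
typical = regulation_score(density, ref_densities, 5.0, 5.0)
atypical = regulation_score(density, ref_densities, 8.0, 2.0)
```

In this sketch, a sample that falls in a dense region of the reference distribution receives a score near 1, while a sample whose OGT/OGA pair is rarely seen in the reference receives a score near 0. A joint density captures departures from co-expression that a single-gene level or a simple ratio can miss.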

## Key findings

- The method outperformed canonical metrics in simulated datasets with controlled dysregulation levels.
- In TCGA data from six cancer types, cancer samples had significantly lower regulation scores than healthy samples (0.25–0.30 vs. 0.49–0.51).
- The scores classified cancer status with an AUROC of 0.71–0.75 and generalized well to external datasets without retraining.

## Abstract

O-GlcNAcylation, a dynamic post-translational modification regulated by O-GlcNAc transferase (OGT) and O-GlcNAcase (OGA), influences critical biological processes and is dysregulated in cancers. Direct measurement of O-GlcNAcylation dysregulation is challenging because the modification is unstable and measurement is low-throughput, which limits large-scale studies. However, the regulatory simplicity of this system and the availability of transcriptomic data enable inference of dysregulation from OGT and OGA expression.

We introduce a nonparametric kernel density estimation-based approach to quantify O-GlcNAcylation dysregulation using joint OGT and OGA expression. In simulated datasets with varied expression patterns and controlled dysregulation levels, our method consistently outperformed canonical metrics in quantifying dysregulation. In TCGA data from six cancer types, inferred regulation scores were significantly lower in cancer samples (0.25–0.30 vs. 0.49–0.51) and showed strong distributional differences (Kolmogorov–Smirnov P values <5.95e-11; D-statistics >0.31) compared to those from healthy samples. The scores also allow for accurate classification of cancer status (AUROC: 0.71–0.75) and generalized well to external datasets without retraining. This transcriptomics-based framework offers a scalable approach for interpretable quantification of O-GlcNAcylation dysregulation in cancer.
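The two evaluation statistics reported above can be illustrated with a short sketch: the two-sample Kolmogorov–Smirnov D-statistic (the largest gap between two empirical CDFs) and the AUROC (the probability that a randomly chosen cancer sample ranks as more cancer-like than a randomly chosen healthy one). The scores below are made-up toy values, not data from the study, and the helper names `ks_statistic` and `auroc` are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS D: maximum absolute gap between empirical CDFs.

    The gap between two step functions can only change at observed
    values, so it is enough to check every value in either sample.
    """
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, v) / len(a) - bisect_right(b, v) / len(b))
        for v in a + b
    )


def auroc(positive, negative):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive has the higher value, counting ties as half a win."""
    wins = 0.0
    for p in positive:
        for q in negative:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive) * len(negative))


# Toy regulation scores (illustrative only).
healthy = [0.2, 0.4, 0.5, 0.6, 0.8]
cancer = [0.1, 0.2, 0.3, 0.3, 0.5]

d_stat = ks_statistic(healthy, cancer)

# Lower regulation scores indicate cancer here, so the negated score
# is used as the "cancer-likeness" value for the AUROC.
auc = auroc([-s for s in cancer], [-s for s in healthy])
# For this toy data, D is about 0.6 and the AUROC is 0.8.
```

This sketch computes only the D-statistic. A P value would normally come from a statistics library, for example `scipy.stats.ks_2samp`, rather than from hand-written code.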

The code and datasets used in this study are freely available at https://github.com/wonder-ai/O-GlcNAcylation_Project under an open-source license.

## Linked entities

- **Genes:** OGT (O-linked N-acetylglucosamine (GlcNAc) transferase) [NCBI Gene 8473], OGA (O-GlcNAcase) [NCBI Gene 10724]
- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Genes:** OGA (O-GlcNAcase) [NCBI Gene 10724] {aka MEA5, MGEA5, NCOAT}, OGT (O-linked N-acetylglucosamine (GlcNAc) transferase) [NCBI Gene 8473] {aka HINCUT-1, HRNT1, MRX106, O-GLCNAC, OGT1, XLID106}
- **Diseases:** cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12980335/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12980335/full.md

## References

45 references — full list in the complete paper: https://tomesphere.com/paper/PMC12980335/full.md

---
Source: https://tomesphere.com/paper/PMC12980335