# Novel machine‐learning bioinformatics reveal distinct metabolic alterations for enhanced colorectal cancer diagnosis and monitoring

**Authors:** Rui Xu, Hyein Jung, Fouad Choueiry, Shiqi Zhang, Rachel Pearlman, Heather Hampel, Ning Jin, Jieli Li, Jiangjiang Zhu

PMC · DOI: 10.1002/imo2.70003 · iMetaOmics · 2025-03-03

## TL;DR

A new machine learning pipeline called PANDA identifies metabolic changes in colorectal cancer, offering a non-invasive way to diagnose and monitor the disease.

## Contribution

The PANDA (PLS-ANN-DA) pipeline combines partial least squares discriminant analysis (PLS-DA) with an artificial neural network (ANN) to improve CRC diagnosis and biomarker discovery.

## Key findings

- Key metabolic pathways, such as the TCA cycle and purine metabolism, are upregulated in CRC compared to healthy controls.
- Metabolic shifts correlate with tumor and lymph node stages, including increased pyruvic acid in metastatic cases.
- PANDA integrates metabolomic and transcriptomic data to enhance CRC biomarker discovery and monitoring.

## Abstract

Colorectal cancer (CRC) is the second leading cause of cancer‐related mortality in the United States when considering both men and women. Colonoscopy remains the gold standard for CRC diagnosis but is invasive, costly, and requires extensive bowel preparation and sedation. Recent advancements in high throughput “omics” technologies may offer less invasive methods for CRC diagnosis through biomarker discovery. This study introduces a novel bioinformatics pipeline, PLS‐ANN‐DA (PANDA), combining partial least squares discriminant analysis (PLS‐DA) with an advanced artificial neural network (ANN) to improve CRC diagnosis and monitor disease progression. We analyzed metabolic alterations in CRC using a metabolomics data set of 626 CRC cases and 402 healthy controls (HC). Meanwhile, complementary transcriptomic data were also analyzed and integrated to further understand CRC metabolic dysregulations. By integrating metabolomics and transcriptomics analyses and establishing the biomarker discovery pipeline PANDA, significant metabolic pathway alterations were identified between CRC patients and healthy controls, with notable upregulation of multiple pathways in CRC. Meanwhile, we observed a downregulation of specific pathways, including purine metabolism and the tricarboxylic acid (TCA) cycle, associated with advanced tumor stages. The PANDA pipeline showed promising outcomes by effectively differentiating CRC from healthy states and providing insight into metabolic shifts occurring in advanced CRC stages. Genetic mutation‐associated metabolic changes were also discovered. Overall, this method has the potential for noninvasive CRC diagnostics and may serve as a valuable tool for understanding metabolic changes in cancer progression.

The PANDA pipeline for colorectal cancer (CRC) analysis integrates metabolomic data from LC‐MS with machine learning techniques to classify CRC stages and predict biomarkers. Initially, the raw metabolomic data is processed using partial least squares discriminant analysis (PLS‐DA) to reduce dimensionality and highlight key features. These selected features are then used to train a neural network, which learns to classify CRC stages (T1–T4) and predict relevant biomarkers. This approach allows for a more accurate diagnosis, early detection, and identification of potential therapeutic targets, contributing to personalized treatment strategies for CRC patients.
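The two-stage idea described above (PLS-DA to compress the feature table, then a neural network trained on the resulting scores) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the task is simplified to a binary case-versus-control split rather than T1–T4 staging, and every function name and hyperparameter here (`make_data`, `pls_fit`, `pls_scores`, `mlp_train`, two components, four hidden units) is a hypothetical choice made for illustration. It uses only the Python standard library.

```python
import math
import random

def make_data(n_per_class, n_features, shift, rng):
    """Synthetic stand-in for a metabolite table (NOT the study data).
    Class 1 ("case") has its first three features shifted upward."""
    rows = []
    for label in (0, 1):
        for _ in range(n_per_class):
            x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            if label == 1:
                x[:3] = [v + shift for v in x[:3]]
            rows.append((x, label))
    rng.shuffle(rows)
    return [x for x, _ in rows], [label for _, label in rows]

def column_means(X):
    return [sum(col) / len(X) for col in zip(*X)]

def pls_fit(X, y, n_components):
    """PLS1 via NIPALS on centered X and centered 0/1 labels (the PLS-DA step).
    Returns weight vectors W and loading vectors P, one per component."""
    X = [row[:] for row in X]  # copy, because X is deflated in place
    n_features = len(X[0])
    W, P = [], []
    for _ in range(n_components):
        w = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(n_features)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:
            break  # no covariance with the labels is left
        w = [v / norm for v in w]
        t = [sum(a * b for a, b in zip(row, w)) for row in X]
        tt = sum(v * v for v in t)
        p = [sum(row[j] * ti for row, ti in zip(X, t)) / tt for j in range(n_features)]
        for row, ti in zip(X, t):  # remove the part of X explained by this component
            for j in range(n_features):
                row[j] -= ti * p[j]
        W.append(w)
        P.append(p)
    return W, P

def pls_scores(x, W, P):
    """Project one centered sample onto the fitted components."""
    x = x[:]
    scores = []
    for w, p in zip(W, P):
        t = sum(a * b for a, b in zip(x, w))
        x = [a - t * b for a, b in zip(x, p)]
        scores.append(t)
    return scores

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))  # clamp to avoid overflow

def mlp_forward(t, net):
    W1, b1, W2, b2 = net
    h = [math.tanh(sum(wi * ti for wi, ti in zip(row, t)) + bias)
         for row, bias in zip(W1, b1)]
    return h, sigmoid(sum(wi * hi for wi, hi in zip(W2, h)) + b2)

def mlp_train(T, y, hidden, epochs, lr, rng):
    """One tanh hidden layer and a sigmoid output, trained by plain SGD."""
    k = len(T[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(k)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    net = [W1, b1, W2, 0.0]
    for _ in range(epochs):
        for t, target in zip(T, y):
            h, out = mlp_forward(t, net)
            err = out - target  # cross-entropy gradient w.r.t. the pre-sigmoid output
            for m in range(hidden):
                grad_h = err * W2[m] * (1.0 - h[m] ** 2)  # backprop through tanh
                W2[m] -= lr * err * h[m]
                b1[m] -= lr * grad_h
                for j in range(k):
                    W1[m][j] -= lr * grad_h * t[j]
            net[3] -= lr * err
    return net

def run_demo(seed=0):
    rng = random.Random(seed)
    X_train, y_train = make_data(100, 10, 2.0, rng)
    X_test, y_test = make_data(60, 10, 2.0, rng)
    means = column_means(X_train)
    y_mean = sum(y_train) / len(y_train)
    Xc = [[v - m for v, m in zip(row, means)] for row in X_train]
    W, P = pls_fit(Xc, [yi - y_mean for yi in y_train], n_components=2)
    T_train = [pls_scores(row, W, P) for row in Xc]
    net = mlp_train(T_train, y_train, hidden=4, epochs=40, lr=0.05, rng=rng)
    hits = 0
    for row, label in zip(X_test, y_test):
        t = pls_scores([v - m for v, m in zip(row, means)], W, P)
        hits += int((mlp_forward(t, net)[1] >= 0.5) == (label == 1))
    return hits / len(X_test)

print(f"held-out accuracy: {run_demo():.2f}")
```

The test samples are centered with the training means and projected with the training weights, so no information from the held-out set leaks into the PLS-DA step. A real analysis would more likely rely on an established library for both stages and would need cross-validation to choose the number of components.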

Significant upregulation of key metabolic pathways, including the tricarboxylic acid (TCA) cycle, purine metabolism, and amino acid metabolism, was observed in colorectal cancer (CRC) cases compared to healthy controls.

Metabolic shifts correlated with tumor (T) and lymph node (N) stages, including increased pyruvic acid levels and decreased phenol levels in metastatic cases.

Purine metabolism showed distinct patterns, with upregulation in CRC but downregulation in advanced tumor stages, linked to oncogenic signaling and nutrient deprivation.

The PANDA pipeline effectively integrated metabolomic and transcriptomic data, enhancing robustness in CRC biomarker discovery and progression monitoring.

## Linked entities

- **Chemicals:** pyruvic acid (PubChem CID 1060), phenol (PubChem CID 996)
- **Diseases:** colorectal cancer (MONDO:0005575)

## Full-text entities

- **Diseases:** CRC (MESH:D015179), cancer (MESH:D009369)
- **Chemicals:** TCA (MESH:D014233)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12806213/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12806213/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12806213/full.md

---
Source: https://tomesphere.com/paper/PMC12806213