# Predicting Sprint Potential: A Machine Learning Model Based on Blood Metabolite Profiles in Young Male Athletes

**Authors:** Jingfeng Chen, Yuhang Qian, Yuansheng Xu

PMC · DOI: 10.1002/ejsc.12272 · European Journal of Sport Science · 2025-02-24

## TL;DR

This study uses blood metabolites to distinguish athletes from non-athletes and predict sprint performance in young males.

## Contribution

Identifies two blood lipids, HMDB0012085 (a sphingomyelin) and HMDB0009224 (a phosphatidylethanolamine), as candidate biomarkers that differentiate athletes from healthy individuals and correlate with sprint performance.

## Key findings

- HMDB0012085 and HMDB0009224 are significantly elevated in athletes compared to healthy individuals.
- Higher whole-blood levels of both metabolites correlate significantly with shorter sprint times in the 100, 200, and 400 m races.
- Machine learning models using these metabolites effectively classify athletes versus non-athletes.
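
The classification finding above can be illustrated with a minimal sketch. It is not the authors' pipeline (the paper compares 13 classifiers); it fits a single threshold split per metabolite and compares how much each split reduces Gini impurity, a rough stand-in for tree-based feature importance. All levels are synthetic values in arbitrary units, and the names `sm_levels`, `pe_levels`, `gini`, and `best_split` are assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, impurity_decrease) for the best single split on one feature."""
    parent = gini(labels)
    n = len(labels)
    best = (None, 0.0)
    for t in sorted(set(values)):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        if not left or not right:
            continue
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        gain = parent - child
        if gain > best[1]:
            best = (t, gain)
    return best


random.seed(0)

# SYNTHETIC data, not from the paper: 50 controls (label 0), 100 athletes (label 1).
labels = [0] * 50 + [1] * 100
# Hypothetical sphingomyelin-like feature with a large group difference.
sm_levels = [random.gauss(1.0, 0.3) for _ in range(50)] + [
    random.gauss(2.0, 0.3) for _ in range(100)
]
# Hypothetical phosphatidylethanolamine-like feature with a smaller difference.
pe_levels = [random.gauss(1.0, 0.3) for _ in range(50)] + [
    random.gauss(1.4, 0.3) for _ in range(100)
]

t_sm, gain_sm = best_split(sm_levels, labels)
t_pe, gain_pe = best_split(pe_levels, labels)

# Express the two impurity decreases as percentages that sum to 100,
# the same presentation as the importance pairs quoted in the abstract.
total = gain_sm + gain_pe
importance = {"SM": 100 * gain_sm / total, "PE": 100 * gain_pe / total}

# Training accuracy of the rule "athlete if SM level is above the threshold".
accuracy = sum((x > t_sm) == bool(y) for x, y in zip(sm_levels, labels)) / len(labels)

print(importance, round(accuracy, 3))
```

On real measurements one would hold out a test set or cross-validate rather than report training accuracy, and a library classifier would likely replace the hand-written split.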

## Abstract

This study aims to utilize male blood metabolite signatures for (i) distinguishing between healthy individuals and athletes, thereby optimizing the athlete screening process; and (ii) predicting athletic performance in 100, 200, and 400 m sprints, enhancing precompetition preparation and intervention strategies. Initially, we employed nontargeted metabolomics to analyze the blood metabolome of healthy individuals (n = 10) and athletes (n = 10), identifying differentially expressed metabolites (DEMs) potentially related to athletic performance through differential analysis, consensus clustering, WGCNA, and UMAP analysis. Subsequently, using LASSO‐Cox analysis, we refined our selection to two core DEMs associated with athletic performance: HMDB0012085 (Sphingomyelin (d18:0/14:0)) and HMDB0009224 (Phosphatidylethanolamine(20:0/18:1(9Z))). We then applied targeted metabolomics to measure the levels of these DEMs in a larger cohort of healthy individuals (n = 50) and athletes (n = 100), revealing a significant increase in the levels of HMDB0012085 and HMDB0009224 in athletes compared to healthy individuals. Utilizing 13 machine learning classification methods, we demonstrated that the levels of HMDB0012085 and HMDB0009224 in blood effectively differentiate between healthy individuals and athletes. Notably, HMDB0012085 exhibits greater feature importance than HMDB0009224 across multiple algorithms: decision trees (94.1 vs. 5.9), random forests (60.7 vs. 39.3), gradient boosting trees (91.5 vs. 8.5), CatBoost (61.7 vs. 38.3), ExtraTrees (64.7 vs. 35.3), and XGBoost (74.5 vs. 25.5). Finally, we found a significant negative correlation between the levels of HMDB0012085 and HMDB0009224 in whole blood and sprint times for 100, 200, and 400 m races. In conclusion, HMDB0012085 and HMDB0009224 in whole blood hold promise as biomarkers for predicting athletic potential in males.
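
A negative correlation between metabolite level and sprint time means that higher levels go with shorter (faster) times. The sketch below shows how such a coefficient could be computed. The abstract does not state which correlation statistic was used, so Pearson's r is an assumption here, and the levels and 100 m times are synthetic numbers generated only for illustration.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


random.seed(1)

# SYNTHETIC values, not from the paper: metabolite levels in arbitrary units.
levels = [random.uniform(1.0, 3.0) for _ in range(100)]
# Hypothetical 100 m times (seconds) that fall as the level rises, plus noise.
times_100m = [12.5 - 0.5 * level + random.gauss(0.0, 0.2) for level in levels]

r = pearson_r(levels, times_100m)
print(f"Pearson r = {r:.2f}")  # negative: higher level, shorter time
```

A rank-based statistic such as Spearman's rho would be a reasonable alternative if the relationship is monotonic but not linear.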

## Linked entities

- **Chemicals:** Sphingomyelin (d18:0/14:0) (PubChem CID 44260138)

## Full-text entities

- **Diseases:** injuries (MESH:D014947), muscle fiber damage (MESH:C563545), inflammation (MESH:D007249), muscle damage (MESH:D009133), DEMs (MESH:D001039), fatigue (MESH:D005221)
- **Chemicals:** glucose (MESH:D005947), sucrose (MESH:D013395), phospholipid (MESH:D010743), amino acids (MESH:D000596), Sphingomyelin (MESH:D013109), starch (MESH:D013213), diacylglycerol (MESH:D004075), creatine (MESH:D003401), Phosphatidylethanolamine (MESH:C483858), sphingolipid (MESH:D013107), phosphorylethanolamine (MESH:C005448), ATP (MESH:D000255), Lactic acid (MESH:D019344), GPEtn (-), pentose phosphate (MESH:D010428), glycerophospholipid (MESH:D020404), fatty acids (MESH:D005227), sphingosine (MESH:D013110), ceramide (MESH:D002518), triacylglycerol (MESH:D014280), Carbohydrates (MESH:D002241)
- **Species:** Homo sapiens (human, species) [taxon 9606]
- **Cell lines:** S2 — Drosophila melanogaster (Fruit fly), Spontaneously immortalized cell line (CVCL_Z232)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11849406/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11849406/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC11849406/full.md

---
Source: https://tomesphere.com/paper/PMC11849406