# Development and validation of a machine learning based early warning scoring system for high altitude polycythemia

**Authors:** Yangzong Suona, Zhuoga Danzeng, Luobu Gesang, Panduo Zhuoma, Yangjin Baima, Zhuoma Pubu, Wangjie Suolang, Bai Ci, Ju Huang, Quzong Zhaxi, Binyun Liu, Rui Zhang, Quzhen Gesang, Qiangba Dingzeng, Zhuoga Baima

PMC · DOI: 10.3389/fpubh.2025.1739909 · Frontiers in Public Health · 2026-01-21

## TL;DR

A machine learning model was developed to predict high altitude polycythemia using lifestyle factors, helping health workers identify at-risk individuals in remote high-altitude regions.

## Contribution

The paper introduces a machine learning-based early warning scoring system for high-altitude polycythemia built on modifiable lifestyle variables.

## Key findings

- Logistic regression achieved the best balance among the compared models (AUC 0.848, sensitivity 0.81, specificity 0.79).
- Independent predictors were low SpO2 (<83%), male sex, age ≥50 years, smoking, hypertension, higher BMI, and lower tea consumption.
- The score is intended to help frontline health workers identify high-risk residents and start low-cost lifestyle interventions in high-altitude populations.

## Abstract

High-altitude polycythemia (HAPC) lacks a lifestyle-focused risk-stratification tool among lifelong high-altitude residents. Here we aimed to develop and validate a novel machine-learning predictive scoring system for HAPC using readily modifiable lifestyle variables in this population.

In a high-altitude cohort (≥4,500 m, n = 1,089), 82 candidate variables were reduced to seven lifestyle predictors via LASSO. Logistic regression, XGBoost, and random forest models were then trained and compared using 10-fold cross-validation.
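As a rough illustration of the 10-fold scheme mentioned above, the sketch below splits 1,089 sample indices into ten held-out folds using only the Python standard library. The function name `k_fold_indices`, the shuffling seed, and the choice of plain (non-stratified) folds are assumptions made for this example; the summary does not say how the authors built their folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for plain k-fold cross-validation.

    Hypothetical helper for illustration; not the authors' code.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed so the split is repeatable
    # Spread samples as evenly as possible: the first n_samples % k folds
    # receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


# Cohort size taken from the abstract (n = 1,089).
folds = list(k_fold_indices(1089, k=10))
fold_sizes = [len(test_idx) for _, test_idx in folds]
```

Each candidate model would be fitted on `train_idx` and scored on `test_idx` in every fold, with the ten scores averaged. With an imbalanced outcome, a stratified split that preserves the case ratio in each fold is often preferred.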

Logistic regression achieved the best balance (AUC 0.848, sensitivity 0.81, specificity 0.79). Low SpO2 (<83%), male sex, age ≥50 years, smoking, hypertension, higher body mass index (BMI), and lower tea consumption were independent predictors.
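For readers less familiar with the three reported metrics, the following minimal sketch shows how sensitivity, specificity, and AUC can be computed from true labels and predicted risk scores. The labels, scores, and the 0.5 decision threshold are made-up toy values, not study data, and the function names are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (illustrative values only): 1 = HAPC, 0 = no HAPC.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
threshold = 0.5  # assumed cut-off for flagging a resident as high risk
y_pred = [1 if s >= threshold else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the abstract reports all three together.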

This score equips frontline health workers in extremely high-altitude, resource-scarce settings to rapidly pinpoint high-risk residents and initiate low-cost lifestyle interventions, thereby curbing the incidence of chronic altitude-related illnesses, easing local medical burdens, and improving overall quality of life for native high-altitude populations.

Clinical trial registration: ChiCTR2100047945.

## Full-text entities

- **Diseases:** polycythemia (MESH:D011086), HAPC (MESH:C535833), hypertension (MESH:D006973)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12868181/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12868181/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/PMC12868181/full.md

---
Source: https://tomesphere.com/paper/PMC12868181