# Construction and validation of a machine learning-based nomogram model for predicting pneumonia risk in patients with catatonia: a retrospective observational study

**Authors:** Yi-chao Wang, Qian He, Yue-jing Wu, Li Zhang, Sha Wu, Xiao-jia Fang, Shao-shen Jia, Fu-gang Luo

PMC · DOI: 10.3389/fpsyt.2025.1557659 · 2025-03-14

## TL;DR

A machine learning model was developed to predict pneumonia risk in hospitalized catatonia patients based on pre-admission factors.

## Contribution

A novel nomogram model using machine learning was created to predict pneumonia risk in catatonia patients based on pre-admission features.

## Key findings

- The Gradient Boosting Machine model achieved the highest AUC of 0.954 for predicting pneumonia risk.
- Five key variables (Age, Clozapine, Diaphoresis, Intake Refusal, and Waxy Flexibility) were identified as significant predictors.
- The nomogram model showed good discrimination and calibration with an AUC of 0.803 in validation.

## Abstract

Catatonia is often complicated by pneumonia, and severe pneumonia that develops after admission poses significant challenges to treatment. This study aimed to develop a Nomogram Model based on pre-admission characteristics of patients with catatonia to predict the risk of pneumonia after admission.

This retrospective observational study reviewed catatonia patients hospitalized at Hangzhou Seventh People’s Hospital from September 2019 to November 2024. Data included demographic characteristics, medical history, maintenance medications, and pre-admission clinical presentations. Patients were divided into two groups: catatonia with pneumonia and catatonia without pneumonia. The LASSO Algorithm was used for feature selection, and seven machine learning models were trained: Decision Tree (DT), Logistic Regression (LR), Naive Bayes (NB), Random Forest (RF), K Nearest Neighbors (KNN), Gradient Boosting Machine (GBM), and Support Vector Machine (SVM). Model performance was evaluated using AUC, Accuracy, Sensitivity, Specificity, Positive Predictive Value, Negative Predictive Value, F1 Score, Cohen’s Kappa, and Brier Score. The best-performing model was selected for multivariable analysis to determine the variables included in the final Nomogram Model. The Nomogram Model was further validated through ROC Curves, Calibration Curves, Decision Curve Analysis (DCA), and Bootstrapping to assess discrimination, calibration, and clinical applicability.
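As a rough illustration of the threshold-based metrics listed above, the following sketch computes them from a confusion matrix, together with Cohen's Kappa and the Brier Score. It is not the authors' code: the function names `safe_div` and `binary_metrics`, the 0.5 threshold, and the toy labels in the usage note are assumptions made for the example.

```python
def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute confusion-matrix metrics, Cohen's kappa, and the Brier score.

    y_true: list of 0/1 labels (1 = pneumonia in this illustration).
    y_prob: list of predicted probabilities for class 1.
    threshold: assumed cut-off for turning probabilities into class labels.
    """
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = len(pairs)

    accuracy = safe_div(tp + tn, n)
    # Agreement expected by chance, from the row and column totals.
    expected = safe_div((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)

    return {
        "accuracy": accuracy,
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "ppv": safe_div(tp, tp + fp),
        "npv": safe_div(tn, tn + fn),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
        "kappa": safe_div(accuracy - expected, 1 - expected),
        # Brier score: mean squared error of the predicted probabilities.
        "brier": safe_div(sum((p - t) ** 2 for t, p in zip(y_true, y_prob)), n),
    }
```

For example, `binary_metrics([1, 1, 1, 0, 0, 0], [0.9, 0.8, 0.4, 0.3, 0.6, 0.1])` evaluates six toy cases. A lower Brier Score indicates better-calibrated probabilities, while Kappa corrects raw accuracy for chance agreement.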

Among 156 patients, 79 had no pneumonia and 77 had pneumonia. The LASSO Algorithm identified 15 variables with non-zero coefficients (1-SE λ = 0.076). The GBM showed the best performance (AUC = 0.954, 95% CI: 0.924-0.983; P < 0.05 vs. other models by DeLong’s test). Five key variables (Age, Clozapine, Diaphoresis, Intake Refusal, and Waxy Flexibility) were used to construct the Nomogram Model. Validation showed good discrimination (AUC = 0.803, 95% CI: 0.735-0.870), calibration, and clinical applicability. Internal validation (Bootstrapping, n = 500) confirmed model stability (AUC = 0.814, 95% CI: 0.743-0.878; Hosmer-Lemeshow P = 0.525).
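To make the reported AUC confidence intervals more concrete, here is a minimal sketch of a percentile bootstrap for the AUC. It assumes a plain resampling-with-replacement scheme, which may differ from the procedure the authors used. The names `auc` and `bootstrap_auc_ci`, the seed, and the toy data in the usage note are illustrative assumptions.

```python
import random


def auc(y_true, y_score):
    """AUC as the probability that a random positive case outranks a random
    negative case, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, y_score, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC.

    n_boot=500 mirrors the number of resamples mentioned in the abstract;
    the resampling scheme itself is an assumption of this sketch.
    """
    if len(set(y_true)) < 2:
        raise ValueError("both classes must be present in y_true")
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_true = [y_true[i] for i in idx]
        # Skip resamples that contain a single class: AUC is undefined there.
        if len(set(resampled_true)) < 2:
            continue
        stats.append(auc(resampled_true, [y_score[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper
```

Calling `bootstrap_auc_ci(labels, scores)` on a model's predicted risks returns a `(lower, upper)` pair. With small samples the interval can be wide, which is one reason internal validation by resampling is commonly reported alongside the point estimate.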

This study developed a Nomogram Model based on five key factors, demonstrating significant clinical value in predicting the risk of pneumonia in hospitalized patients with catatonia.
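For readers unfamiliar with nomograms, the sketch below shows one common way a logistic regression is turned into a points scale: the predictor with the largest possible effect spans 0 to 100 points, and total points map back to a predicted probability. Every number in it is a hypothetical placeholder and none comes from the study. The predictor names are deliberately generic, and in the paper's model the five selected variables would occupy these slots.

```python
import math

# HYPOTHETICAL placeholder values on the log-odds scale. They are NOT the
# study's coefficients and exist only to illustrate the arithmetic.
HYPOTHETICAL_INTERCEPT = -3.0
HYPOTHETICAL_PREDICTORS = {
    # name: (coefficient, minimum value, maximum value)
    "continuous_predictor": (0.05, 20, 80),
    "binary_predictor_a": (1.2, 0, 1),
    "binary_predictor_b": (-0.8, 0, 1),
}


def max_effect_span(predictors):
    """Largest possible log-odds swing of any single predictor."""
    return max(abs(b) * (hi - lo) for b, lo, hi in predictors.values())


def points(name, value, predictors, span):
    """Points for one predictor value, scaled so the widest predictor spans 0-100."""
    beta, lo, hi = predictors[name]
    # The reference is the end of the range with the lowest risk,
    # so that points are never negative.
    reference = lo if beta >= 0 else hi
    return 100.0 * beta * (value - reference) / span


def probability_from_points(total_points, intercept, predictors, span):
    """Map total points back to a predicted probability."""
    baseline = intercept + sum(
        b * (lo if b >= 0 else hi) for b, lo, hi in predictors.values()
    )
    linear_predictor = baseline + total_points * span / 100.0
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```

Summing `points(...)` over a patient's predictor values and passing the total to `probability_from_points` gives the same probability as evaluating the logistic model directly, since the points scale is only a linear rescaling of the log-odds.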

## Linked entities

- **Chemicals:** Clozapine (PubChem CID 135398737)
- **Diseases:** pneumonia (MONDO:0005249), catatonia (MONDO:0800105)

## Full-text entities

- **Diseases:** pneumonia (MESH:D011014), Catatonia (MESH:D002389)
- **Chemicals:** Clozapine (MESH:D003024)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

9 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11951867/full.md

---
Source: https://tomesphere.com/paper/PMC11951867