# Stratify severe risk in children with respiratory syncytial virus pneumonia—A retrospective study based on machine learning and SHAP interpretation

**Authors:** Jun-An Pan, Wen-Hao Yang, Chao-Fen Wu, Min Zhou, Juan Liu, Li-Na Chen

PMC · DOI: 10.3389/fped.2026.1775752 · Frontiers in Pediatrics · 2026-03-16

## TL;DR

This study uses machine learning to identify risk factors for severe RSV pneumonia in children, aiming to improve early detection and treatment.

## Contribution

The study combines machine learning models with SHAP interpretation to identify and explain key risk factors for severe RSV pneumonia in children.

## Key findings

- XGBoost model achieved high AUC values (0.949 in training, 0.818 in testing) for predicting severe RSV pneumonia.
- SHAP analysis ranked fever duration, diarrhea, hemoglobin concentration, rhinorrhea, age, and neutrophil-to-lymphocyte ratio among the most important predictors of severe cases.

## Abstract

Respiratory syncytial virus (RSV) is the primary pathogen causing severe lower respiratory tract infections in children, imposing a significant disease burden worldwide. The clinical manifestations of RSV infection are not highly specific; in severe cases, the infection may trigger a severe inflammatory response in the body, potentially resulting in death. Currently, early identification and risk stratification tools for severe RSV-related pneumonia remain inadequate. This study identified potential high-risk factors for severe disease in children with RSV pneumonia by screening variables and building machine learning models, with the aim of supporting individualized prevention, diagnosis, and treatment for these patients.

Variables were screened using univariate analysis followed by multivariate logistic regression analysis. The performance of five machine learning models on the training and test sets was compared using receiver operating characteristic (ROC) curves, and the XGBoost model, which had the best overall performance, was selected as the final model. Finally, SHapley Additive exPlanations (SHAP) was used to quantify feature contributions and provide a clinically interpretable analysis of this black-box model.
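The model comparison described above rests on the area under the ROC curve (AUC). The following is a minimal sketch of that metric, not the authors' pipeline: it computes AUC as the fraction of positive/negative pairs that a model ranks correctly, then picks the higher-scoring of two candidates. The labels, scores, and the names `model_a` and `model_b` are hypothetical toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case is scored
    above a random negative case (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical test-set labels (1 = severe) and predicted probabilities.
y_test = [1, 0, 1, 0, 0, 1]
model_scores = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.1, 0.6],
    "model_b": [0.6, 0.5, 0.4, 0.7, 0.2, 0.8],
}

# Score every candidate on the same held-out labels and keep the best one.
aucs = {name: roc_auc(y_test, s) for name, s in model_scores.items()}
best = max(aucs, key=aucs.get)
```

In practice the same comparison is run on both the training and the test set, because a large gap between the two AUC values suggests overfitting.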

Twelve key variables were identified in patients with severe respiratory syncytial virus pneumonia. XGBoost demonstrated the best overall performance and was selected as the final model, achieving AUC values of 0.949 and 0.818 in the training and test sets, respectively. SHAP analysis indicated that fever duration, diarrhea, hemoglobin concentration, rhinorrhea, age, neutrophil-to-lymphocyte ratio, gestational age, neutrophil count, mode of delivery, and lymphocyte count may be the most important predictive variables for severe RSV pneumonia in children.
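To illustrate what a SHAP attribution measures, here is a hedged sketch that computes exact Shapley values by brute force for a three-feature toy model. It is not the procedure used in the study: SHAP libraries use faster tree-specific algorithms and average over a background dataset, whereas this sketch replaces "absent" features with a single baseline patient. The function `toy_model`, its coefficients, and the patient values are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample. Features outside a coalition
    are replaced by the corresponding baseline value."""
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that excludes feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    """Hypothetical risk score; coefficients are illustrative only."""
    fever_days, diarrhea, hgb = z
    return 0.1 * fever_days + 0.3 * diarrhea - 0.02 * hgb + 0.05 * fever_days * diarrhea


x = [6.0, 1.0, 100.0]         # hypothetical patient: fever days, diarrhea (0/1), HGB in g/L
baseline = [2.0, 0.0, 120.0]  # hypothetical reference patient
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the model output for the patient and for the baseline, which is the additivity property that makes per-feature contributions comparable across patients.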

Our findings demonstrated that prolonged fever duration, presence of diarrhea, decreased hemoglobin concentration (HGB), absence of rhinorrhea, age under 3 months (Age<3 m), and elevated neutrophil-to-lymphocyte ratio (NLR) were predictors of severe cases among children with RSV pneumonia.
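Two of these predictors are derived rather than measured directly. As a small sketch, assuming neutrophil and lymphocyte counts are reported in the same unit, the neutrophil-to-lymphocyte ratio and the Age<3 m indicator could be computed as follows. The function name `derive_features` and the input values are hypothetical, and no cutoff for an "elevated" NLR is implied.

```python
def derive_features(age_months, neutrophils, lymphocytes):
    """Return the Age<3 m flag and the neutrophil-to-lymphocyte ratio.
    Both counts must use the same unit (for example 10^9 cells/L)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return {
        "age_lt_3m": age_months < 3,
        "nlr": neutrophils / lymphocytes,
    }


# Hypothetical patient values, not taken from the study.
features = derive_features(age_months=2, neutrophils=6.0, lymphocytes=3.0)
```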

## Full-text entities

- **Diseases:** respiratory tract infections (MESH:D012141), rhinorrhea (MESH:D012818), RSV pneumonia (MESH:D011014), fever (MESH:D005334), diarrhea (MESH:D003967), severe respiratory syncytial virus pneumonia (MESH:D000086382), inflammatory (MESH:D007249)
- **Species:** Homo sapiens (human, species) [taxon 9606], Respiratory syncytial virus (no rank) [taxon 12814]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13033736/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13033736/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/PMC13033736/full.md

---
Source: https://tomesphere.com/paper/PMC13033736