# Comparative evaluation of machine learning models for enhancing diagnostic accuracy of otitis media with effusion in children with adenoid hypertrophy

**Authors:** Xiaote Zhang, Qiaoyi Xie, Ganggang Wu

PMC · DOI: 10.3389/fped.2025.1614495 · Frontiers in Pediatrics · 2025-06-19

## TL;DR

This study compares five machine learning algorithms for diagnosing ear fluid (otitis media with effusion) in children with enlarged adenoids, using clinical and acoustic immittance data; a Random Forest model performed best.

## Contribution

The study introduces a Random Forest model that outperformed the four other machine learning algorithms it was compared against in diagnosing otitis media with effusion in children with adenoid hypertrophy.

## Key findings

- The Random Forest model achieved the highest discrimination of the five algorithms (AUC = 0.919) for OME in children with AH, with a Brier score of 0.102.
- The adenoid-to-nasopharyngeal ratio was identified as the most important predictive variable by SHAP analysis.
- In the testing cohort the model reached 0.875 accuracy and a Cohen's kappa of 0.696, with the best clinical utility at decision thresholds of 0.4–0.6.

## Abstract

Otitis media with effusion (OME) affects a significant proportion of children with adenoid hypertrophy (AH) and can lead to developmental sequelae when chronic. Current non-invasive screening modalities rely predominantly on acoustic immittance measurements, which demonstrate variable diagnostic performance. Given the urgent need for improved diagnostic methods and extensive characterization of risk factors for OME in AH children, developing diagnostic models represents an efficient strategy to enhance clinical identification accuracy in practice.

This study aims to develop and validate an optimal machine learning (ML)-based prediction model for OME in AH children by comparing multiple algorithmic approaches, integrating clinical indicators with acoustic measurements into a widely applicable diagnostic tool.

A retrospective analysis was conducted on 847 pediatric patients with AH. Five ML algorithms were developed to identify OME using demographic, clinical, laboratory, and acoustic immittance parameters. The dataset underwent 7:3 stratified partitioning for training and testing cohorts. Within the training cohort, models were initially optimized through randomized grid search with 5-fold cross-validation, followed by comprehensive training with optimized parameters. Model performance was evaluated in the testing cohort using discrimination, calibration, clinical utility metrics, and confusion matrix-derived statistics. The optimal ML model was subsequently analyzed through SHapley Additive exPlanations (SHAP) methodology for interpretability, with sequential ablation testing performed to identify critical predictive variables.
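The partitioning described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `stratified_split` and `stratified_folds`, the seed, and the integer rounding rule are all assumptions, and the labels are synthetic placeholders that only reuse the cohort counts reported in the abstract (262 OME cases among 847 children).

```python
import random


def stratified_split(labels, train_parts=7, total_parts=10, seed=0):
    """Split indices 7:3 per class so both cohorts keep the class ratio.

    Hypothetical helper; the rounding rule (floor) is an assumption.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    train, test = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        cut = len(idx) * train_parts // total_parts
        train.extend(idx[:cut])
        test.extend(idx[cut:])
    return sorted(train), sorted(test)


def stratified_folds(labels, indices, k=5, seed=0):
    """Deal each class round-robin into k folds for cross-validation."""
    rng = random.Random(seed)
    by_class = {}
    for i in indices:
        by_class.setdefault(labels[i], []).append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Synthetic labels: 1 = OME, 0 = no OME (counts taken from the abstract).
labels = [1] * 262 + [0] * 585
train_idx, test_idx = stratified_split(labels)
folds = stratified_folds(labels, train_idx, k=5)
```

In a hyperparameter search, each fold in `folds` would serve once as the validation set while the other four are used for fitting; the held-out `test_idx` stays untouched until the final evaluation.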

Among 847 children with AH, 262 (30.9%) were diagnosed with OME. The Random Forest (RF) model demonstrated superior performance with the highest discrimination (area under the receiver operating characteristic curve = 0.919), balanced calibration (Brier score = 0.102), and optimal clinical utility across decision thresholds of 0.4–0.6. Confusion matrix analysis further confirmed RF as the optimal model, achieving 0.875 accuracy and robust inter-rater agreement (Cohen's kappa coefficient = 0.696) in the testing cohort. SHAP analysis identified the adenoid-to-nasopharyngeal ratio as the predominant diagnostic indicator, followed by tympanometric type and history of recurrent respiratory infections.
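For readers unfamiliar with the metrics quoted above, the following sketch shows how each is computed from true labels and predicted probabilities. The toy data are invented for illustration and do not come from the study. The `net_benefit` function assumes that clinical utility was assessed with decision curve analysis, which the abstract does not state explicitly.

```python
def roc_auc(y_true, y_prob):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between predictions and reference labels, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    classes = set(y_true) | set(y_pred)
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in classes)
    return (observed - expected) / (1 - expected)


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at one threshold (assumed utility metric)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Invented toy data: 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1, 0.05]
y_pred = [int(p >= 0.5) for p in y_prob]

auc = roc_auc(y_true, y_prob)
brier = brier_score(y_true, y_prob)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohen_kappa(y_true, y_pred)
nb = net_benefit(y_true, y_prob, threshold=0.5)
```

Note that kappa is lower than raw accuracy whenever some agreement is expected by chance, which is why a model with 0.875 accuracy can report a kappa near 0.7 when roughly 31% of the cohort is positive.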

An RF-based diagnostic model effectively identifies OME in AH children by integrating anatomical, functional, and inflammatory parameters, providing a clinically applicable tool for enhanced diagnostic accuracy and evidence-based management decisions.

## Linked entities

- **Diseases:** otitis media with effusion (MONDO:0005892), adenoid hypertrophy (MONDO:0000740)

## Full-text entities

- **Diseases:** OME (MESH:D010034), respiratory infections (MESH:D012141), inflammatory (MESH:D007249), AH (MESH:D006984)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12222205/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12222205/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC12222205/full.md

---
Source: https://tomesphere.com/paper/PMC12222205