# Performance comparison of artificial intelligence models in predicting 72-h emergency department unscheduled return visits

**Authors:** Lumin Fan, Xinghua Zuo, Lunxian Tang, Honglin Xiong, Yanli You, Chongjun Fan

PMC · DOI: 10.3389/fpubh.2025.1609206 · Frontiers in Public Health · 2025-12-19

## TL;DR

This retrospective study compares five AI models for predicting unscheduled emergency department (ED) return visits within 72 hours of discharge. TabNet performed best (AUROC 0.867) and could support risk stratification and discharge planning.

## Contribution

The study compares four traditional machine learning algorithms and one deep learning architecture (TabNet) within a single cohort of 143,192 ED visits, and identifies TabNet as the top-performing model for predicting 72-h unscheduled return visits.

## Key findings

- TabNet achieved an AUROC of 0.867 (95% CI: 0.854–0.880) and a sensitivity of 0.809 (95% CI: 0.801–0.816) in predicting 72-h ED return visits.
- Key predictive variables included initial diagnoses of digestive and respiratory system diseases, patient age, P3 triage classification, and ED visit frequency.
- TabNet outperformed all four traditional machine learning models: logistic regression, support vector machine, random forest, and extreme gradient boosting.
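
For readers less familiar with the two headline metrics, the following is a minimal sketch of how AUROC and sensitivity can be computed from predicted risk scores. It is not the authors' evaluation code. The function names and the toy data are illustrative assumptions, and the classification threshold behind the reported sensitivity is not stated in this summary.

```python
def auroc(y_true, scores):
    """AUROC via the rank-sum (Mann-Whitney) identity.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5).
    """
    pairs = sorted(zip(scores, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores and give each member the average rank.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n_pos = sum(1 for _, y in pairs if y)
    n_neg = n - n_pos
    pos_rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def sensitivity(y_true, scores, threshold):
    """Share of true return visits flagged at or above the threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y and s < threshold)
    return tp / (tp + fn)


# Toy data (not from the study): 1 = return visit, 0 = no return visit.
labels = [0, 0, 1, 1]
risk = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, risk), sensitivity(labels, risk, threshold=0.5))
```

Sensitivity depends on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the two are reported separately.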

## Abstract

Unscheduled return visits (URVs) to emergency departments (EDs) contribute significantly to healthcare burden through resource utilization and ED overcrowding. While artificial intelligence (AI) methodologies show potential in URV prediction, existing studies have employed limited algorithms with moderate performance, highlighting the need for comprehensive AI architecture comparison within unified cohorts.

This study evaluated the predictive performance of multiple AI models for 72-h ED URVs, aiming to identify optimal risk stratification strategies for improved discharge planning and targeted interventions.

This retrospective study analyzed adult internal medicine visits to the ED at a tertiary hospital. URVs were defined as ED revisits occurring within 72 h after initial ED discharge time. The dataset was partitioned into training (70%) and testing (30%) sets. Four traditional machine learning algorithms (logistic regression, support vector machine, random forest, and extreme gradient boosting) and one deep learning architecture (TabNet) were developed with Bayesian optimization for hyperparameter tuning. Model performance was assessed through comprehensive metrics including discrimination, calibration, clinical utility, and confusion matrices. The optimal model underwent feature importance analysis, systematic ablation studies, sensitivity analyses, and subgroup fairness evaluation.
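
The outcome definition and the 70/30 partition described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record fields (`patient_id`, `arrival`, `discharge`) are hypothetical, the 72-h boundary is treated as inclusive, the label is attached to the index visit, and the split is stratified by outcome (the abstract reports consistent URV proportions but does not say how they were obtained). The sketch also does not distinguish scheduled from unscheduled returns.

```python
import random
from collections import defaultdict
from datetime import datetime, timedelta

URV_WINDOW = timedelta(hours=72)


def label_urvs(visits):
    """Flag each index visit that is followed by a return within 72 h.

    `visits` is a list of dicts with hypothetical keys `patient_id`,
    `arrival`, and `discharge` (datetime objects). Returns a list of
    booleans aligned with the input order.
    """
    by_patient = defaultdict(list)
    for idx, visit in enumerate(visits):
        by_patient[visit["patient_id"]].append(idx)

    labels = [False] * len(visits)
    for idxs in by_patient.values():
        idxs.sort(key=lambda i: visits[i]["arrival"])
        # Compare each visit with the same patient's next visit.
        for cur, nxt in zip(idxs, idxs[1:]):
            gap = visits[nxt]["arrival"] - visits[cur]["discharge"]
            if timedelta(0) <= gap <= URV_WINDOW:
                labels[cur] = True
    return labels


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so both sets keep roughly the same URV proportion."""
    rng = random.Random(seed)
    train, test = [], []
    for cls in (False, True):
        idxs = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_fraction)
        test.extend(idxs[:n_test])
        train.extend(idxs[n_test:])
    return sorted(train), sorted(test)


# Toy example (not study data): patient A returns 46 h after discharge.
t0 = datetime(2024, 1, 1, 8, 0)
toy = [
    {"patient_id": "A", "arrival": t0, "discharge": t0 + timedelta(hours=4)},
    {"patient_id": "A", "arrival": t0 + timedelta(hours=50),
     "discharge": t0 + timedelta(hours=53)},
    {"patient_id": "B", "arrival": t0, "discharge": t0 + timedelta(hours=2)},
]
print(label_urvs(toy))
```

Measuring the gap from the discharge time of the index visit, rather than from its arrival time, follows the definition given in the abstract.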

Of 143,192 analyzed visits, 24,117 (16.8%) were classified as URVs. Data were allocated into training (n = 100,235) and testing (n = 42,957) sets with consistent URV proportions. TabNet demonstrated optimal discriminative performance with AUROC 0.867 (95% CI: 0.854–0.880) and sensitivity of 0.809 (95% CI: 0.801–0.816). Decision curve analysis demonstrated sustained clinical utility across threshold probabilities of 10–30%. Feature importance analysis identified initial diagnoses of digestive and respiratory system diseases, patient age, P3 triage classification, and ED visit frequency as key predictive variables. Subgroup analysis confirmed consistent performance across patient demographics and clinical characteristics.
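
The cohort arithmetic and the decision curve result above can be checked with a short sketch. The net benefit formula used here is the standard one for decision curve analysis (true positives minus false positives weighted by the odds of the threshold probability, per patient). The function names and the toy scores are illustrative assumptions; only the visit counts come from the abstract.

```python
def net_benefit(y_true, scores, pt):
    """Net benefit of flagging patients whose predicted risk is >= pt."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= pt and y)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= pt and not y)
    return tp / n - (fp / n) * pt / (1 - pt)


def net_benefit_treat_all(prevalence, pt):
    """Net benefit of flagging every patient, the usual reference strategy."""
    return prevalence - (1 - prevalence) * pt / (1 - pt)


# Counts reported in the abstract.
total_visits, urvs = 143_192, 24_117
train_n, test_n = 100_235, 42_957
prevalence = urvs / total_visits

print(f"URV proportion: {prevalence:.1%}")
print(f"Training share: {train_n / total_visits:.1%}")
for pt in (0.10, 0.20, 0.30):
    print(pt, round(net_benefit_treat_all(prevalence, pt), 4))
```

Under this formula, the flag-everyone strategy has positive net benefit only while the threshold probability is below the URV prevalence. Across the upper part of the reported 10–30% range it is therefore negative, so a model with positive net benefit there improves on both flagging everyone and flagging no one.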

TabNet outperformed traditional machine learning approaches in predicting 72-h ED URVs, offering potential for improved risk stratification in emergency care settings.

## Full-text entities

- **Diseases:** digestive and respiratory system diseases (MESH:D004066)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12757422/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12757422/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/PMC12757422/full.md

---
Source: https://tomesphere.com/paper/PMC12757422