# Machine Learning-Based Prediction of No-Show Telemedicine Encounters

**Authors:** C. Mahony Reategui-Rivera, Wanting Cui, Stefan Escobar-Agreda, Leonardo Rojas-Mezarina, Joseph Finkelstein

PMC · DOI: 10.1089/tmr.2025.0009 · Telemedicine Reports · 2025-04-07

## TL;DR

This study uses machine learning to predict which patients will miss telemedicine appointments in Peru, aiming to improve healthcare access and efficiency.

## Contribution

The study introduces cost-sensitive machine learning techniques to address class imbalance in predicting telemedicine no-shows in a real-world health system.

## Key findings

- Cost-sensitive XGBoost achieved the most balanced performance: recall 0.639, specificity 0.805, accuracy 0.799, and AUC 0.722, with precision of only 0.123.
- Key predictors of no-shows include patient demographics, socioeconomic factors, and appointment timing.
- Tailored interventions based on model insights could improve telemedicine adherence and healthcare equity.

## Abstract

This study aimed to evaluate the performance of machine learning (ML) models in predicting patient no-shows for telemedicine appointments within the Peruvian health system and to identify key predictors of nonattendance.

We performed a retrospective observational study using anonymized data (June 2019–November 2023) from “Teleatiendo.” The dataset included over 1.5 million completed appointments and about 64,000 no-shows (4.1%), focusing on teleorientation and telemonitoring. Predictor variables included patient demographics, socioeconomic factors, health care facility characteristics, appointment timing, and telemedicine service types. A 70% training, 10% validation, and 20% testing split was used over 10 iterations, with hyperparameter tuning performed on the validation set to identify optimal model parameters. Multiple ML approaches (random forest, XGBoost, LightGBM, and anomaly detection) were implemented in combination with undersampling and cost-sensitive learning to address class imbalance. Performance was evaluated using precision, recall, specificity, area under the curve (AUC), F1-score, and accuracy.
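The split and the two imbalance strategies described above can be illustrated with a minimal, standard-library sketch. This is not the authors' pipeline: the synthetic labels, the helper names (`split_indices`, `positive_class_weight`, `undersample`), and the negative-to-positive weight ratio are assumptions chosen for illustration. The abstract does not state which class weights or sampling ratios were actually used.

```python
import random

def split_indices(n, seed, train=0.7, val=0.1):
    """Shuffle 0..n-1 and cut into train/validation/test (70/10/20 by default)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def positive_class_weight(indices, labels):
    """Cost-sensitive weight for the minority class: negatives / positives.
    Assumption: a common heuristic, not a value reported in the paper."""
    pos = sum(labels[i] for i in indices)
    neg = len(indices) - pos
    return neg / pos

def undersample(indices, labels, seed):
    """Random undersampling: keep all positives and an equal number of negatives."""
    rng = random.Random(seed)
    pos = [i for i in indices if labels[i] == 1]
    neg = [i for i in indices if labels[i] == 0]
    balanced = pos + rng.sample(neg, len(pos))
    rng.shuffle(balanced)
    return balanced

# Synthetic labels with 4% positives (1 = no-show), roughly matching the reported rate.
labels = [1 if i % 25 == 0 else 0 for i in range(10_000)]

# One of the repeated iterations; a different seed would be used for each.
train_idx, val_idx, test_idx = split_indices(len(labels), seed=0)

weight = positive_class_weight(train_idx, labels)      # cost-sensitive route
balanced_idx = undersample(train_idx, labels, seed=0)  # undersampling route

print(len(train_idx), len(val_idx), len(test_idx))
print(round(weight, 1), len(balanced_idx))
```

In both routes only the training portion is reweighted or resampled. The validation and test portions keep the original class ratio, so the reported metrics reflect the real no-show rate.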

Of the models tested, undersampling with XGBoost achieved a precision of 0.115 (±0.001), recall of 0.654 (±0.005), specificity of 0.786 (±0.002), AUC of 0.720 (±0.002), and accuracy of 0.780 (±0.002). In contrast, cost-sensitive XGBoost exhibited a balanced performance with a precision of 0.123 (±0.001), recall of 0.639 (±0.006), specificity of 0.805 (±0.004), AUC of 0.722 (±0.001), and accuracy of 0.799 (±0.003). Additionally, cost-sensitive random forest achieved the highest specificity (0.843 ± 0.002) and accuracy (0.832 ± 0.001) but recorded a lower recall (0.585 ± 0.004), while cost-sensitive LightGBM and balanced random forest yielded performance metrics similar to cost-sensitive XGBoost. Isolation forest, used for anomaly detection, demonstrated the lowest performance.
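The low precision values alongside reasonable recall and specificity follow from the 4.1% no-show rate. The sketch below derives precision and accuracy from recall, specificity, and prevalence using the standard confusion-matrix identities. The function name `derived_metrics` and the rounded prevalence of 0.041 are illustrative assumptions; the F1 values it computes are not reported in the abstract and are shown only as derived quantities.

```python
def derived_metrics(recall, specificity, prevalence):
    """Derive precision, accuracy, and F1 from recall, specificity, and prevalence.

    All quantities are expressed as fractions of the whole test set.
    """
    tp = prevalence * recall                    # no-shows correctly flagged
    tn = (1 - prevalence) * specificity         # attended visits correctly cleared
    fp = (1 - prevalence) * (1 - specificity)   # attended visits wrongly flagged
    precision = tp / (tp + fp)
    accuracy = tp + tn
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "accuracy": accuracy, "f1": f1}

PREVALENCE = 0.041  # assumption: the rounded no-show rate quoted in the abstract

under_xgb = derived_metrics(recall=0.654, specificity=0.786, prevalence=PREVALENCE)
cost_xgb = derived_metrics(recall=0.639, specificity=0.805, prevalence=PREVALENCE)
cost_rf = derived_metrics(recall=0.585, specificity=0.843, prevalence=PREVALENCE)

for name, m in [("undersampling XGBoost", under_xgb),
                ("cost-sensitive XGBoost", cost_xgb),
                ("cost-sensitive random forest", cost_rf)]:
    print(f"{name}: precision={m['precision']:.3f} accuracy={m['accuracy']:.3f}")
```

With these inputs the derived precision and accuracy land within about 0.002 of the reported figures. Because attended visits outnumber no-shows roughly 23 to 1, even a false-positive rate near 20% produces far more false alarms than true positives, which keeps precision near 0.12.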

ML models can moderately predict telemedicine no-shows in Peru, with cost-sensitive boosting techniques enhancing the identification of high-risk patients. Key predictors reflect both individual behavior and system-level contexts, suggesting the need for tailored, context-specific interventions. These findings can inform targeted strategies to optimize telemedicine, improve appointment adherence, and promote equitable health care access.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12235123/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12235123/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC12235123/full.md

---
Source: https://tomesphere.com/paper/PMC12235123