# Prediction of medication overuse in patients with migraine using Cox regression and machine learning: a real-world cohort

**Authors:** Teerapong Aramruang, Pawin Numthavaj, Panu Looareesuwan, Thunyarat Anothaisintawee, Patratorn Kunakorntham, Oraluck Pattanaprateep, Charungthai Dejthevaporn, Ammarin Thakkinstian

PMC · DOI: 10.1186/s10194-026-02269-3 · The Journal of Headache and Pain · 2026-01-24

## TL;DR

This study uses real-world data and machine learning to predict medication overuse in migraine patients, finding that random survival forests perform slightly better than traditional models.

## Contribution

The study introduces multi-domain prediction models for medication overuse in migraine using real-world EHR data and compares machine learning with traditional methods.

## Key findings

- The RSF model slightly outperformed CPH on the testing dataset, with a higher C-index (0.645 vs. 0.635) and a lower IBS (0.193 vs. 0.195).
- Clinic type, physician position, and MO/MOH history were significant predictors across models.
- XGBoost performed worse than both RSF and CPH in predicting MO/MOH (C-index 0.611; IBS 0.197).

## Abstract

Medication overuse (MO) is a critical issue for patients with migraine, contributing to chronification and medication overuse headache (MOH). Predicting those at risk is essential for effective management. This study aims to develop and compare time-to-event prediction models for MO/MOH among patients with migraine, using a cohort from electronic health records (EHRs).

A real-world data cohort of patients with migraine was assembled at Ramathibodi Hospital, Thailand, from January 2010 to December 2023 using a prevalent new-user design. The cohort was constructed from EHR data and incorporated common predictors related to the patient, physician, and treatment. Three time-to-event models were developed: Cox proportional hazards (CPH), random survival forests (RSF), and extreme gradient boosting (XGBoost). Model performance was evaluated on a hold-out testing dataset in terms of discrimination and calibration. Variable importance in the machine learning models was assessed using Shapley Additive Explanations.
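Discrimination for time-to-event models is commonly summarized with a concordance index. The sketch below is a minimal, standard-library illustration of Harrell's C-index for right-censored data. It is not the authors' code, and the abstract does not state which C-index variant was used. The function name `concordance_index` and the toy data are hypothetical.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data (illustrative sketch).

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter follow-up time than subject j. The pair is
    concordant when the model gives subject i the higher risk score;
    tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event flag (True = MO/MOH
# observed, False = censored), and a model's predicted risk score.
toy_times = [2.0, 4.0, 6.0, 8.0]
toy_events = [True, True, False, True]
toy_risk = [0.9, 0.7, 0.8, 0.1]

c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index on toy data: {c_index:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why C-indices in the 0.61 to 0.65 range are described as modest discrimination.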

The study included 13,082 patients with migraine, of whom 3,456 were identified as experiencing MO/MOH, corresponding to an incidence rate [95% confidence interval (CI)] of 56.31 (54.44–58.21) per 1,000 patient-years. On the testing dataset, the RSF model achieved a concordance index (C-index) of 0.645 (95% CI: 0.643–0.647), slightly outperforming the CPH model’s C-index of 0.635 (95% CI: 0.634–0.636). Additionally, the RSF model recorded the lowest integrated Brier score (IBS) of 0.193 (95% CI: 0.192–0.194), compared to 0.195 (95% CI: 0.194–0.196) for the CPH model. The XGBoost model demonstrated lower performance, with a C-index of 0.611 (95% CI: 0.609–0.613) and an IBS of 0.197 (95% CI: 0.195–0.199). Across all models, clinic type, physician position, and history of MO/MOH were significant predictors.
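The incidence rate above is an event count divided by total follow-up time. The sketch below illustrates that arithmetic under two assumptions that are not in the abstract. First, total follow-up is not reported, so the person-years value is back-calculated from 3,456 events and the reported 56.31 per 1,000 patient-years. Second, the interval method is not stated, so a log-normal approximation is used here. The resulting bounds should therefore land close to the reported ones but need not match them exactly.

```python
import math


def incidence_rate_per_1000(
    events: int, person_years: float, z: float = 1.96
) -> tuple[float, float, float]:
    """Return (rate, lower, upper) per 1,000 person-years.

    Uses a log-normal approximation, where the standard error of the
    log rate is 1 / sqrt(events). This is an assumed method, chosen for
    illustration only.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    rate = events / person_years
    half_width = z / math.sqrt(events)
    lower = rate * math.exp(-half_width)
    upper = rate * math.exp(half_width)
    return 1000 * rate, 1000 * lower, 1000 * upper


EVENTS = 3456  # MO/MOH events reported in the abstract
# ASSUMPTION: back-calculated as 3456 / (56.31 / 1000); the abstract
# does not report total follow-up time.
ASSUMED_PERSON_YEARS = 61_374.0

rate, lower, upper = incidence_rate_per_1000(EVENTS, ASSUMED_PERSON_YEARS)
print(f"{rate:.2f} ({lower:.2f} to {upper:.2f}) per 1,000 patient-years")
```

Because the interval width depends only on the event count under this approximation, a large number of events such as 3,456 yields a narrow interval regardless of how the follow-up time is distributed across patients.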

Using a real-world, EHR-derived cohort, we developed time-to-event prediction models incorporating multi-domain predictors to predict MO/MOH in patients with migraine. Although the models demonstrated only modest discrimination, their performance highlights the potential of CPH and machine learning algorithms in this context. External validation and the incorporation of additional clinical predictors, particularly those embedded in unstructured data, are needed.

The online version contains supplementary material available at 10.1186/s10194-026-02269-3.

## Linked entities

- **Diseases:** migraine (MONDO:0005277)

## Full-text entities

- **Diseases:** migraine (MESH:D008881)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12911303/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12911303/full.md

## References

3 references — full list in the complete paper: https://tomesphere.com/paper/PMC12911303/full.md

---
Source: https://tomesphere.com/paper/PMC12911303