# Predicting five-year comorbid bipolar disorder after attention-deficit/hyperactivity disorder diagnosis: a population-based machine learning approach

**Authors:** Yen-Shan Yang, Chih-Wei Hsu, Liang-Jen Wang, Kuo-Chuan Hung, Yang-Chieh Brian Chen, Chih-Sung Liang, Mu-Hong Chen

PMC · DOI: 10.1186/s13034-025-01002-3 · 2025-12-01

## TL;DR

This study trains a machine-learning model on Taiwan's National Health Insurance claims data to predict which patients with ADHD will develop bipolar disorder within five years of diagnosis.

## Contribution

The authors develop an XGBoost classifier, interpreted with SHAP, that predicts subsequent comorbid bipolar disorder in patients with ADHD from nationwide insurance claims data.

## Key findings

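- The model achieved a ROC-AUC of 0.90 and a precision–recall AUC of 0.59 for predicting bipolar disorder (BD) after ADHD diagnosis; sensitivity was 50% and positive predictive value 43%.
- Leading predictors included older age at ADHD onset, post-diagnosis exposure to anticonvulsant mood stabilizers, and sparse psychiatric visits before ADHD diagnosis followed by frequent visits afterward.
- Protective factors included having offspring with schizophrenia-spectrum disorders and fewer new-onset upper-respiratory infections after ADHD diagnosis.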
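
To make the ROC-AUC figure above concrete: it can be read as the probability that a randomly chosen patient who later developed BD receives a higher risk score than a randomly chosen patient who did not. The sketch below computes that quantity by direct pair counting. The function name `roc_auc` and the toy labels and scores are illustrative assumptions, not the study's data or code.

```python
def roc_auc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (not study data): 2 positives among 10 patients.
labels = [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]
scores = [0.02, 0.10, 0.05, 0.30, 0.25, 0.01, 0.08, 0.03, 0.60, 0.04]
auc = roc_auc(labels, scores)
print(f"AUROC = {auc:.4f}")
```

Because this measure depends only on how cases are ranked against non-cases, it can stay high when cases are rare, which is one reason to read it alongside the precision–recall AUC.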

## Abstract

Early detection and accurate prediction of comorbid bipolar disorder (BD) in individuals with attention-deficit/hyperactivity disorder (ADHD) are clinically critical. This study used machine-learning methods to identify features predictive of subsequent BD among patients initially diagnosed with ADHD.

We analyzed claims from the Taiwan National Health Insurance Research Database (2000–2013) and included patients aged ≥ 12 years with at least two diagnoses of ADHD. Predictor features included demographics (sex, age at ADHD onset), healthcare utilization (psychiatric outpatient visit counts), comorbidities (International Classification of Diseases–coded diagnoses), psychiatric medications (Anatomical Therapeutic Chemical–coded prescriptions), and family psychiatric history. All features were extracted from prespecified windows around the ADHD diagnosis date (index date). The primary outcome was a subsequent BD diagnosis. We trained an extreme gradient boosting (XGBoost) classifier and tuned hyperparameters via grid search to maximize the area under the receiver operating characteristic curve (AUROC). Feature importance was interpreted with Shapley additive explanations (SHAP).
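
As an illustration of what extracting a feature from windows around the index date might look like, the following minimal sketch counts psychiatric outpatient visits before and after the ADHD index date. The abstract does not state the window lengths, so the 365-day window, the function name `count_visits_in_windows`, the feature names, and the example dates are all assumptions made for this sketch.

```python
from datetime import date, timedelta


def count_visits_in_windows(visit_dates, index_date, window_days=365):
    """Count visits in the window before and the window after the index date.

    The pre-window is [index - window_days, index) and the post-window is
    [index, index + window_days]. The 365-day default is a placeholder,
    not the window length used in the study.
    """
    window = timedelta(days=window_days)
    pre = sum(1 for d in visit_dates if index_date - window <= d < index_date)
    post = sum(1 for d in visit_dates if index_date <= d <= index_date + window)
    return {"visits_pre": pre, "visits_post": post, "visit_change": post - pre}


# Hypothetical patient: one visit before the index date, four on or after it.
index_date = date(2008, 3, 1)
visits = [
    date(2007, 11, 15),
    date(2008, 3, 1),   # the index visit itself falls in the post-window
    date(2008, 4, 12),
    date(2008, 6, 30),
    date(2008, 10, 5),
    date(2010, 1, 20),  # outside the post-window, so not counted
]
features = count_visits_in_windows(visits, index_date)
print(features)
```

Keeping the pre- and post-index counts as separate features, rather than only their difference, lets a tree-based model pick up a pattern such as few visits before diagnosis combined with many afterward.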

Among 15,093 eligible patients, 266 (2%) developed BD during follow-up. The model achieved an AUROC of 0.90 and a precision–recall AUC of 0.59; accuracy was 98%, specificity 99%, sensitivity 50%, and positive predictive value 43%. Twelve leading predictors emerged. The strongest behavioral signal was sparse psychiatric visits before ADHD diagnosis followed by frequent visits afterward (SHAP = 0.27 and 0.66, respectively). Core demographic risks were older age at ADHD onset (SHAP = 0.26) and male sex (SHAP = 0.08). Medication patterns included pre-diagnosis short-acting benzodiazepines (SHAP = 0.07) and post-diagnosis exposure to anticonvulsant mood stabilizers (SHAP = 0.34), “-dones” (SHAP = 0.06) and “-pines” (SHAP = 0.05) antipsychotics, selective serotonin-reuptake inhibitors (SHAP = 0.06), and Z-drugs (SHAP = 0.05). Protective features were having offspring with schizophrenia-spectrum disorders (SHAP = 0.11) and fewer new-onset upper-respiratory infections after ADHD diagnosis (SHAP = 0.06).
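
The reported percentages can be checked against one another with a little arithmetic. The abstract does not say which data split the metrics were computed on, so the sketch below assumes, purely for illustration, that they apply to the full cohort of 15,093 patients with 266 BD cases. The function name `confusion_from_summary` and the derived counts are approximations under that assumption, not figures from the paper.

```python
def confusion_from_summary(n_total, n_positive, sensitivity, ppv):
    """Back out approximate confusion-matrix counts from summary metrics."""
    tp = round(sensitivity * n_positive)   # sensitivity = TP / (TP + FN)
    fn = n_positive - tp
    fp = round(tp * (1 - ppv) / ppv)       # PPV = TP / (TP + FP)
    tn = (n_total - n_positive) - fp
    return tp, fp, fn, tn


# Assumption: the metrics describe the whole cohort (not stated in the abstract).
tp, fp, fn, tn = confusion_from_summary(15093, 266, 0.50, 0.43)

specificity = tn / (tn + fp)
accuracy = (tp + tn) / (tp + fp + fn + tn)
# Accuracy of a trivial rule that never predicts BD, for comparison.
baseline_accuracy = (15093 - 266) / 15093

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"specificity={specificity:.3f} accuracy={accuracy:.3f}")
print(f"never-predict-BD accuracy={baseline_accuracy:.3f}")
```

Under this assumption the derived specificity and accuracy round to the reported 99% and 98%. A rule that never predicts BD would also reach about 98% accuracy at this prevalence, so sensitivity, positive predictive value, and the precision–recall AUC say more about the model's usefulness than accuracy does.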

Leveraging nationwide real-world data, we built a machine-learning model to predict subsequent comorbid BD in patients with ADHD. The identified clinical and medication prescribing profiles can alert clinicians to patients at heightened risk, facilitating earlier monitoring and timely intervention.

The online version contains supplementary material available at 10.1186/s13034-025-01002-3.

## Linked entities

- **Diseases:** attention-deficit/hyperactivity disorder (MONDO:0007743), upper-respiratory infections (MONDO:0024355)

## Full-text entities

- **Diseases:** bipolar disorder (MESH:D001714), attention-deficit/hyperactivity disorder (MESH:D001289)

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12776958/full.md

---
Source: https://tomesphere.com/paper/PMC12776958