# Development and external validation of a diagnostic model for differentiating major depressive disorder from bipolar disorder

**Authors:** Hongxin Zheng, Xialong Cheng, Wenxin Gan, Shuyu Duan, Yizi Liu, Kun Li, Chen Su, Chenxi Xu, Yongcan Zhou, Wenwei Zhang, Runbo Wu, Yu Xie

PMC · DOI: 10.1186/s12888-026-07844-1 · 2026-01-28

## TL;DR

This study developed a machine learning model to help distinguish between major depressive disorder and bipolar disorder using electronic medical records, aiming to improve diagnostic accuracy and treatment outcomes.

## Contribution

The study contributes an externally validated machine learning model that uses electronic medical record (EMR) data to differentiate MDD from BD, together with SHAP-based insight into its key predictive features.

## Key findings

- A random forest model achieved an AUC of 0.863 in internal validation and 0.710 in external validation.
- Illness duration, creatine kinase levels, and age of onset were identified as key predictive features.
- The model shows potential as a clinical decision support tool, but its reliance on ICD-10 diagnoses may lead it to misclassify latent bipolar disorder cases.

## Abstract

Misdiagnosing bipolar disorder (BD) as major depressive disorder (MDD) can lead to poor treatment outcomes. This study aims to develop and validate a machine learning-based model to effectively differentiate between MDD and BD using electronic medical record (EMR) data.

This retrospective study enrolled 584 patients with BD and 1,179 patients with MDD from two medical centers between January 2022 and August 2024. Feature selection was performed using both Least Absolute Shrinkage and Selection Operator (LASSO) regression and the Boruta algorithm. Six machine learning (ML) algorithms were used to construct the model. SHapley Additive exPlanations (SHAP) analysis was conducted to improve model interpretability.
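As a rough illustration of how two feature selectors can be combined, the sketch below keeps only the features that both methods retain. The intersection rule is an assumption; the abstract states only that both LASSO and the Boruta algorithm were used, not how their outputs were reconciled. The coefficient values and the `hypothetical_lab_*` names are invented for the example.

```python
def consensus_features(lasso_coefs, boruta_confirmed):
    """Return features kept by both selectors, sorted by name.

    lasso_coefs: dict mapping feature name -> fitted LASSO coefficient
                 (a coefficient of exactly 0 means LASSO dropped the feature).
    boruta_confirmed: set of feature names Boruta marked as confirmed.
    """
    lasso_kept = {name for name, coef in lasso_coefs.items() if coef != 0.0}
    # ASSUMPTION: consensus is the intersection of the two selections.
    return sorted(lasso_kept & set(boruta_confirmed))


# Toy inputs: the numbers and the hypothetical_lab_* names are illustrative only.
lasso_coefs = {
    "illness_duration": 0.42,
    "creatine_kinase": -0.18,
    "age_of_onset": 0.31,
    "hypothetical_lab_a": 0.0,   # shrunk to zero by LASSO
    "hypothetical_lab_b": 0.05,  # kept by LASSO, not confirmed by Boruta
}
boruta_confirmed = {
    "illness_duration",
    "creatine_kinase",
    "age_of_onset",
    "hypothetical_lab_a",
}

selected = consensus_features(lasso_coefs, boruta_confirmed)
print(selected)
```

A union of the two selections would be an equally plausible rule; the intersection is simply the more conservative choice for a small final predictor set.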

Among the six machine learning models built on the selected features, the random forest (RF) model demonstrated the best overall performance, achieving the highest area under the receiver operating characteristic curve (AUC = 0.863) in the internal validation set and moderate discriminative ability (AUC = 0.710) in the external validation set. SHAP analysis identified illness duration, creatine kinase, and age of onset as the most important predictive features.
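For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive case (here, BD) receives a higher predicted score than a randomly chosen negative case (here, MDD). The sketch below computes it that way on toy data; the labels and scores are invented and are not the study's data, and the function name `roc_auc` is illustrative.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 class labels (1 = positive class).
    scores: iterable of predicted scores, higher = more likely positive.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 positive and 4 negative cases with made-up scores.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; library implementations typically use a rank-based computation that gives the same value.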

The 7-predictor RF model demonstrated moderate discriminative performance, showing potential as an auxiliary decision support tool for distinguishing BD from MDD in hospitalized patients. However, the reliance on the International Classification of Diseases, 10th Revision (ICD-10) criteria may result in the misclassification of latent BD, thereby limiting the model’s accuracy and generalizability. Future work should validate the model’s generalization ability in multi-center samples and develop intuitive decision support tools to enhance clinical utility.

Clinical trial registration: Not applicable. This is a retrospective observational study that does not involve any clinical intervention; therefore, clinical trial registration was not required.

The online version contains supplementary material available at 10.1186/s12888-026-07844-1.

## Linked entities

- **Diseases:** bipolar disorder (MONDO:0004985), major depressive disorder (MONDO:0002009)

## Full-text entities

- **Diseases:** bipolar disorder (MESH:D001714), major depressive disorder (MESH:D003865)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12924540/full.md

---
Source: https://tomesphere.com/paper/PMC12924540