# A clinically applicable and generalizable deep learning model for anterior mediastinal tumors in CT images across multiple institutions

**Authors:** Chihiro Takemura, Mototaka Miyake, Kazuma Kobayashi, Hiromi Matsumoto, Ryota Shibaki, Atsushi Urikura, Yasushi Goto, Yasushi Yatabe, Shun-ichi Watanabe, Miyuki Sone, Masahiko Kusumoto, Ryuji Hamamoto, Hirokazu Watanabe

PMC · DOI: 10.1038/s41598-026-37504-z · Scientific Reports · 2026-01-30

## TL;DR

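This paper presents a 3D U-Net-based deep learning model that detects and segments anterior mediastinal tumors, a rare tumor group, in CT scans. The model retains its performance on external test data from 121 institutions that were not used for training or tuning.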

## Contribution

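According to the authors, no prior diagnostic support system for anterior mediastinal tumors had been tested across multiple institutions. This study trains a model on multi-institutional CT data and evaluates its segmentation and detection performance on an external test set drawn from 121 institutions.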

## Key findings

- On the external test set (164 CT images), the model achieved an average Dice score of 0.82, IoU of 0.72, Precision of 0.85, and Recall of 0.82 for tumor segmentation.
- For lesion-wise detection at a stricter IoU threshold of 0.50, it maintained a sensitivity of 0.87 with 0.61 false positives per scan.
- The external test images came from 121 unique institutions that contributed no data to training or tuning, which supports the model's generalizability.
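
The segmentation metrics named above have standard overlap definitions. The following is a minimal sketch of those definitions for a single scan, not the authors' evaluation code. Representing each mask as a set of foreground voxel indices, the function name `segmentation_metrics`, and the convention of returning 0.0 for an empty denominator are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Voxel = Tuple[int, int, int]  # (z, y, x) index of a foreground voxel


def segmentation_metrics(pred: Set[Voxel], ref: Set[Voxel]) -> Dict[str, float]:
    """Overlap metrics between a predicted and a reference mask for one scan."""
    tp = len(pred & ref)  # voxels labeled tumor in both masks
    fp = len(pred - ref)  # voxels predicted as tumor but absent from the reference
    fn = len(ref - pred)  # reference tumor voxels the prediction missed

    def ratio(num: float, den: float) -> float:
        # Assumption: a metric with an empty denominator is reported as 0.0.
        return num / den if den else 0.0

    return {
        "dice": ratio(2 * tp, 2 * tp + fp + fn),
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
    }


# Toy masks: the reference covers x = 0..3, the prediction covers x = 1..4.
ref_mask = {(0, 0, x) for x in range(0, 4)}
pred_mask = {(0, 0, x) for x in range(1, 5)}
metrics = segmentation_metrics(pred_mask, ref_mask)
print(metrics)
```

For a single scan, Dice and IoU are tied by Dice = 2·IoU / (1 + IoU). That identity does not carry over to values averaged across scans, so the reported averages of 0.82 and 0.72 need not satisfy it exactly.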

## Abstract

Rare diseases are often difficult to diagnose, and their scarcity also makes it challenging to develop deep learning models for them due to limited large-scale datasets. Anterior mediastinal tumors—including thymoma and thymic carcinoma—represent such rare entities. A few diagnostic support systems for these tumors have been proposed; however, no prior studies have tested them across multiple institutions, and clinically applicable and generalizable models remain lacking. A total of 711 computed tomography (CT) images were collected from 136 hospitals, each from a different patient with pathologically proven anterior mediastinal tumors (339 males, 372 females). Of these, 485 images were used for training, 62 for tuning, and 164 for external testing. The external testing dataset comprised CT images from 121 unique institutions not involved in the other datasets. A 3D U-Net-based model was trained on the training dataset, and the model with the best performance on the tuning dataset was selected. This model was then evaluated on the external testing dataset for its segmentation and detection performance across different institutions. Based on the reference standards provided by board-certified diagnostic radiologists, the trained model achieved average Dice scores of 0.82, Intersection over Union (IoU) of 0.72, Precision of 0.85, and Recall of 0.82 for tumor segmentation at the CT-image level. The free-response receiver operating characteristic curve—derived from lesion-wise IoU thresholds—demonstrated high sensitivity and a low false-positive rate for tumor detection. Even under a stricter IoU threshold of 0.50, the model maintained a sensitivity of 0.87 with only 0.61 false positives per scan. Our model achieved clinically applicable segmentation and detection performance for anterior mediastinal tumors, demonstrating broad generalizability across 121 institutions and overcoming the data-scarcity challenges inherent to such rare diseases.
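
The detection result in the abstract (sensitivity and false positives per scan at a lesion-wise IoU threshold) can be illustrated with a small sketch. The abstract does not specify the matching rule, so the one used here is an assumption: a reference lesion counts as detected when its best-overlapping predicted lesion reaches the IoU threshold, and every predicted lesion that matches no reference lesion counts as a false positive. The names `lesion_iou` and `detection_operating_point` are hypothetical, and lesions are assumed to be already separated into sets of voxel indices.

```python
from typing import List, Set, Tuple

Voxel = Tuple[int, int, int]
Lesion = Set[Voxel]
Scan = Tuple[List[Lesion], List[Lesion]]  # (predicted lesions, reference lesions)


def lesion_iou(a: Lesion, b: Lesion) -> float:
    """Intersection over Union between two lesions given as voxel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def detection_operating_point(scans: List[Scan], iou_threshold: float = 0.5):
    """Return (sensitivity, false positives per scan) at one IoU threshold."""
    hits = total_ref = false_pos = 0
    for preds, refs in scans:
        total_ref += len(refs)
        matched = set()  # indices of predictions that matched some reference
        for ref in refs:
            # Assumed rule: compare the reference with its best-overlapping prediction.
            best = max(range(len(preds)),
                       key=lambda i: lesion_iou(preds[i], ref),
                       default=None)
            if best is not None and lesion_iou(preds[best], ref) >= iou_threshold:
                hits += 1
                matched.add(best)
        false_pos += len(preds) - len(matched)
    sensitivity = hits / total_ref if total_ref else 0.0
    fp_per_scan = false_pos / len(scans) if scans else 0.0
    return sensitivity, fp_per_scan


def line(start: int, stop: int) -> Lesion:
    """Toy lesion: voxels along the x axis from start (inclusive) to stop (exclusive)."""
    return {(0, 0, x) for x in range(start, stop)}


toy_scans: List[Scan] = [
    ([line(1, 5), {(5, 5, 5)}], [line(0, 4)]),  # one good match plus one stray blob
    ([line(1, 4)], [line(0, 2)]),               # weak overlap, IoU = 0.25
]
strict = detection_operating_point(toy_scans, iou_threshold=0.5)
lenient = detection_operating_point(toy_scans, iou_threshold=0.2)
print(strict, lenient)
```

Evaluating the same function over a range of thresholds gives a set of operating points. In the toy data, tightening the threshold from 0.2 to 0.5 lowers sensitivity and raises the false-positive count, which is the trade-off a free-response ROC curve displays.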

## Linked entities

- **Diseases:** thymoma (MONDO:0006456), thymic carcinoma (MONDO:0006451)

## Full-text entities

- **Diseases:** tumor (MESH:D009369), anterior mediastinal tumors (MESH:D008479), thymic carcinoma (MESH:D013945)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12913931/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12913931/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC12913931/full.md

---
Source: https://tomesphere.com/paper/PMC12913931