# Deep learning ensemble models for CT-based differentiation of malignant and benign sacral bone tumors: development and evaluation

**Authors:** Ping Yin, Fei Zheng, Ke Liu, Kewei Liang, Li Yang, Lin Lu, Ning Lang, Yongmei Li, Nan Hong

PMC · DOI: 10.1186/s13244-026-02220-9 · Insights into Imaging · 2026-03-03

## TL;DR

This study develops an AI model that helps radiologists distinguish between benign and malignant sacral tumors using CT scans, improving diagnostic accuracy, especially for less experienced radiologists.

## Contribution

The first AI-radiologist ensemble model for noncontrast CT-based sacral tumor classification that improves diagnostic performance across all experience levels.

## Key findings

- The ensemble model achieved AUCs of 0.9139 (internal) and 0.8713 (external), with notable improvements in junior radiologists' performance.
- The model may reduce dependence on contrast-enhanced imaging or MRI for diagnosis in musculoskeletal oncology.
- In the external test cohort, all radiologists improved in AUC, accuracy, sensitivity, and specificity when assisted by the model.

## Abstract

Radiologists often face challenges in differentiating benign from malignant sacral bone lesions due to their similar imaging characteristics. This study aimed to develop an ensemble deep learning (DL) model that can preoperatively distinguish between benign and malignant sacral tumors using noncontrast computed tomography images.

Preoperative sacral CT scans from 569 patients with confirmed sacral lesions were analyzed. Data from Center 1 were used for model development and internal testing via fivefold cross-validation, and data from Centers 2 and 3 were used for external testing. Several ensemble models combining human-readable interpretation with DL were developed. The diagnostic performance of the models and of radiologists was assessed using precision, recall, accuracy, area under the curve (AUC), F1 score, and the confusion matrix. The clinical benefit of radiologists’ interpretations supported by the DL model was also evaluated.
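The evaluation metrics named above can be illustrated with a minimal Python sketch. The labels and scores are made-up toy values, not study data, and the function names `binary_metrics` and `auc_rank` are hypothetical. The source does not state how the authors computed these metrics or which decision threshold they used; the 0.5 threshold here is an assumption.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Precision, recall, accuracy, and F1 from binary labels (1 = malignant)."""
    # Confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


def auc_rank(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, NOT data from the study).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = binary_metrics(y_true, y_pred)
auc = auc_rank(y_true, scores)
print(metrics, round(auc, 4))
```

Unlike the other metrics, AUC is computed from the continuous scores rather than the thresholded predictions, so it does not depend on the chosen cutoff.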

The ensemble model that integrates 3D-DenseNet121 with human interpretation exhibited the most robust performance. On the internal and external test sets, respectively, it achieved AUCs of 0.9139 and 0.8713, F1 scores of 0.9054 and 0.8571, precision of 0.9041 and 0.8824, recall of 0.9136 and 0.8333, and accuracy of 0.8630 and 0.8182. In the external test cohort, all radiologists improved in AUC, accuracy, sensitivity, and specificity. Notably, the improvements were more pronounced for junior radiologists than for senior radiologists.
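As a quick consistency check, F1 is the harmonic mean of precision and recall, so the reported values can be compared against that formula. The sketch below uses only the figures quoted above; the helper name `f1_from_pr` is hypothetical.

```python
def f1_from_pr(precision: float, recall: float) -> float:
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
external_f1 = f1_from_pr(0.8824, 0.8333)
internal_f1 = f1_from_pr(0.9041, 0.9136)

print(f"external F1 from P/R: {external_f1:.4f} (reported 0.8571)")
print(f"internal F1 from P/R: {internal_f1:.4f} (reported 0.9054)")
```

For the external test set, the formula reproduces the reported F1 of 0.8571 to four decimals. For the internal set it gives roughly 0.909 rather than the reported 0.9054. One possible reason, not stated in this summary, is that internal metrics were averaged across the five cross-validation folds, in which case the averaged F1 need not equal the F1 of the averaged precision and recall.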

The DL model has the potential to considerably enhance radiologists’ diagnostic efficiency in clinical practice.

This study presents the first ensemble deep learning model integrating 3D-DenseNet121 with radiologists’ interpretation for preoperative differentiation of sacral tumors on noncontrast CT that improved diagnostic performance across all experience levels, particularly for junior radiologists.

First artificial intelligence–radiologist ensemble for noncontrast computed tomography (NCCT)-based sacral tumor classification.

Boosts all radiologists’ performance, with the greatest gains for juniors, potentially reducing referrals.

Enables reliable NCCT diagnosis, overcoming contrast/magnetic resonance imaging dependency in musculoskeletal oncology.

## Full-text entities

- **Diseases:** metastatic (MESH:D000092182), anxiety (MESH:D001007), benign sacral tumors (MESH:D009369), calcification (MESH:D002114), sacral lesions (MESH:C537221), Bone tumors (MESH:D001859), pelvic and sacral tumor (MESH:D010386), low back pain (MESH:D017116), neurological deficits (MESH:D009461), chordoma (MESH:D002817), sarcomas (MESH:D012509), bleeding tendency (MESH:C536965), DL (MESH:D007859), giant cell tumor (MESH:D005870), bone lesions (MESH:D001847), Ewing sarcoma (MESH:D012512), death (MESH:D003643), NCCT (MESH:C000719218)
- **Chemicals:** DCA (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12957694/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12957694/full.md

---
Source: https://tomesphere.com/paper/PMC12957694